Re: [PATCH v2 1/3] Allow memory operands for PTWRITE

2018-11-15 Thread Uros Bizjak
On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen  wrote:
>
> From: Andi Kleen 
>
> The earlier PTWRITE builtin definition was unnecessarily restrictive,
> only allowing register input to PTWRITE. The instruction actually
> supports memory operands too, so allow that too.
>
> gcc/:
>
> 2018-11-15  Andi Kleen  
>
> * config/i386/i386.md: Allow memory operands to ptwrite.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 44db8ac954c..9c359c0ca04 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -19501,7 +19501,7 @@
> (set_attr "prefix_extra" "2")])
>
>  (define_insn "ptwrite"
> -  [(unspec_volatile [(match_operand:SWI48 0 "register_operand" "r")]
> +  [(unspec_volatile [(match_operand:SWI48 0 "nonimmediate_operand" "rm")]
> UNSPECV_PTWRITE)]
>"TARGET_PTWRITE"
>"ptwrite\t%0"
> --
> 2.19.1
>


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-15 Thread Umesh Kalappa
Thank you Marek,Appreciate your valuable feedback on the patch .

Attached the latest ,please do let us know your thoughts.

~Umesh

On Thu, Nov 15, 2018 at 9:26 PM Marek Polacek  wrote:
>
> On Thu, Nov 15, 2018 at 02:26:24PM +0530, Umesh Kalappa wrote:
> > Thank you Marek  for the inputs .
> > >>In the future, if using diff, please also use the -p option.
> > We are using svn diif  and other comments are addressed .
>
> Thanks, but it doesn't seem like the -p option was used.
>
> > please let us know  your take on the  revised attached patch .
>
>
> Index: cp/ChangeLog
> ===
> --- cp/ChangeLog(revision 266026)
> +++ cp/ChangeLog(working copy)
> @@ -1,3 +1,9 @@
> +2018-11-14  Kamlesh Kumar  
> +
> +   PR c++/52869
> +   *parser.c () :  restore the old current_class_{ptr,ref} by
> +   inject_this_parameter().
> +
>
> This is still the same; can you adjust it according to my last suggestion?
>
> Index: cp/parser.c
> ===
> --- cp/parser.c (revision 266026)
> +++ cp/parser.c (working copy)
> @@ -24620,6 +24620,12 @@
> {
>   matching_parens parens;
>   parens.consume_open (parser);
> +
> + tree save_ccp = current_class_ptr;
> + tree save_ccr = current_class_ref;
> +
>
> Watch out for trailing whitespace in the blank lines.
>
> + if (current_class_type)
> +  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
>
> I think you can remove the if here.
>
>   if (require_constexpr)
> {
> @@ -24640,6 +24646,9 @@
> }
>
>   parens.require_close (parser);
> +
> + save_ccp = current_class_ptr = save_ccp;
> + save_ccr = current_class_ref = save_ccr;
>
> You don't need to set save_cc[pr] to itself here.
>
> Index: testsuite/ChangeLog
> ===
> --- testsuite/ChangeLog (revision 266026)
> +++ testsuite/ChangeLog (working copy)
> @@ -1,3 +1,8 @@
> +2018-11-14  Kamlesh Kumar  
> +
> +   PR c++/52869
> +   * g++.dg//DRs/dr52869.C: New.
> +
>
> So DR != PR.  Please name the test dr1207.C
>
> Index: testsuite/g++.dg/DRs/dr52869.C
> ===
> --- testsuite/g++.dg/DRs/dr52869.C  (nonexistent)
> +++ testsuite/g++.dg/DRs/dr52869.C  (working copy)
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0 -std=c++11" } */
>
> Instead of this, do:
>
> // { dg-do compile { target c++11 } }
>
> Also, I wrote a test that fails if current_class_{ptr,ref} aren't properly
> restored:
>
> // DR 1207
> // PR c++/52869
> // { dg-do compile { target c++11 } }
>
> void
> fn ()
> {
>   struct S {
> bool operator!() noexcept(false);
>   } s;
>   S t = s;
> }
>
> So you can add that one too, e.g. testsuite/g++.dg/DRs/dr1207-2.C.
>
> Thanks,
> Marek


pr52869.patch
Description: Binary data


Re: [PATCH] minor FDO profile related fixes

2018-11-15 Thread Indu Bhagat



On 11/12/2018 01:48 AM, Martin Liška wrote:

make check-gcc on x86_64 shows no new failures.

(A related PR washttps://gcc.gnu.org/bugzilla/show_bug.cgi?id=86957  where we 
added diagnostics for the NO PROFILE case.)

Hi.

Thanks for the patch. I'm not a maintainer, but the idea of the patch looks 
correct to me.
One question about adding "(precise)", have you verified that the patch can 
survive regression
tests?

Thanks,
Martin



make check-gcc (configured with --enable-languages=c,c++,fortran,go,lto) 
shows no new failures.

make -k check also looks ok.


[PATCH v2 2/3] Add a pass to automatically add ptwrite instrumentation

2018-11-15 Thread Andi Kleen
From: Andi Kleen 

Add a new pass to automatically instrument changes to variables
with the new PTWRITE instruction on x86. PTWRITE writes a 4 or 8 byte
field into an Processor Trace log, which allows low over head
logging of information. Essentially it's a hardware accelerated
printf.

This allows to reconstruct how values later from the log,
which can be useful for debugging or other analysis of the program
behavior. With the compiler support this can be done with without
having to manually add instrumentation to the code.

Using dwarf information this can be later mapped back to the variables.
The decoder decodes the PTWRITE instructions using IP information
in the log, and then looks up the argument in the debug information.
Then this can be used to reconstruct the original variable
name to display a value history for the variable.

There are new options to enable instrumentation for different types,
and also a new attribute to control analysis fine grained per
function or variable level. The attributes can be set on both
the variable and the type level, and also on structure fields.
This allows to enable tracing only for specific code in large
programs in a flexible matter.

The pass is generic, but only the x86 backend enables the necessary
hooks. When the backend enables the necessary hooks (with -mptwrite)
there is an additional pass that looks through the code for
attribute vartrace enabled functions or variables.

The -fvartrace=locals option is experimental: it works, but it
generates redundant ptwrites because the pass doesn't use
the SSA information to minimize instrumentation.
This could be optimized later.

Currently the code can be tested with SDE, or on a Intel
Gemini Lake system with a new enough Linux kernel (v4.10+)
that supports PTWRITE for PT. Gemini Lake is used in low
end laptops ("Intel Pentium Silver J5.. / Celeron N4... /
Celeron J4...")

Linux perf can be used to record the values

perf record -e intel_pt/ptw=1,branch=0/ program
perf script --itrace=crw -F +synth ...

I have an experimential version of perf that can also use
dwarf information to symbolize many[1] values back to their variable
names. So far it is not in standard perf, but available at

https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=perf/var-resolve-4

It is currently not able to decode all variable locations to names,
but a large subset.

Longer term hopefully gdb will support this information too.

The CPU can potentially generate very data high bandwidths when
code doing a lot of computation is heavily instrumented.
This can cause some data loss in both the CPU and also in perf
logging the data when the disk cannot keep up.

Running some larger workloads most workloads do not cause
CPU level overflows, but I've seen it with -fvartrace
with crafty, and with more workloads with -fvartrace-locals.

Recommendation is to not fully instrument programs,
but only areas of interest either at the file level or using
the attributes.

The other thing is that perf and the disk often cannot keep up
with the data bandwidth for longer computations. In this case
it's possible to use perf snapshot mode (add --snapshot
to the command line above). The data will be only logged to
a memory ring buffer then, and only dump the buffers on events
of interest by sending SIGUSR2 to the perf binrary.

In the future this will be hopefully better supported with
core files and gdb.

Passes bootstrap and test suite on x86_64-linux, also
bootstrapped and tested gcc itself with full -fvartrace
and -fvartrace=locals instrumentation.

gcc/:

2018-11-15  Andi Kleen  

* Makefile.in: Add tree-vartrace.o.
* common.opt: Add -fvartrace
* opts.c (parse_vartrace_options): Add.
(common_handle_option): Call parse_vartrace_options.
* config/i386/i386.c (ix86_vartrace_func): Add.
(TARGET_VARTRACE_FUNC): Add.
* doc/extend.texi: Document vartrace/no_vartrace
attributes.
* doc/invoke.texi: Document -fvartrace.
* doc/tm.texi (TARGET_VARTRACE_FUNC): Add.
* passes.def: Add vartrace pass.
* target.def (vartrace_func): Add.
* tree-pass.h (make_pass_vartrace): Add.
* tree-vartrace.c: New file to implement vartrace pass.

gcc/c-family/:

2018-11-15  Andi Kleen  

* c-attribs.c (handle_vartrace_attribute,
  handle_no_vartrace_attribute): New functions.
  (attr_vartrace_exclusions): Add.

config/:

2018-11-03  Andi Kleen  

* bootstrap-vartrace.mk: New.
* bootstrap-vartrace-locals.mk: New.
---
 config/bootstrap-vartrace-locals.mk |   3 +
 config/bootstrap-vartrace.mk|   3 +
 gcc/Makefile.in |   1 +
 gcc/c-family/c-attribs.c|  77 +
 gcc/common.opt  |   8 +
 gcc/config/i386/i386.c  |  32 ++
 gcc/doc/extend.texi |  32 ++
 gcc/doc/invoke.texi |  27 ++
 gcc/doc/tm.texi 

[PATCH v2 3/3] Add tests for the vartrace pass

2018-11-15 Thread Andi Kleen
From: Andi Kleen 

So far they are mostly i386 target specific. Later they could
be moved up to architecture specific if some other architecture
adds vartracing.

gcc/testsuite/:

2018-11-15  Andi Kleen  

* g++.dg/vartrace-3.C: New test.
* g++.dg/vartrace-ret.C: New test.
* g++.dg/vartrace-ret2.C: New test.
* gcc.target/i386/vartrace-1.c: New test.
* gcc.target/i386/vartrace-10.c: New test.
* gcc.target/i386/vartrace-11.c: New test.
* gcc.target/i386/vartrace-12.c: New test.
* gcc.target/i386/vartrace-13.c: New test.
* gcc.target/i386/vartrace-14.c: New test.
* gcc.target/i386/vartrace-15.c: New test.
* gcc.target/i386/vartrace-16.c: New test.
* gcc.target/i386/vartrace-17.c: New test.
* gcc.target/i386/vartrace-18.c: New test.
* gcc.target/i386/vartrace-19.c: New test.
* gcc.target/i386/vartrace-20.c: New test.
* gcc.target/i386/vartrace-2.c: New test.
* gcc.target/i386/vartrace-3.c: New test.
* gcc.target/i386/vartrace-4.c: New test.
* gcc.target/i386/vartrace-5.c: New test.
* gcc.target/i386/vartrace-6.c: New test.
* gcc.target/i386/vartrace-7.c: New test.
* gcc.target/i386/vartrace-8.c: New test.
* gcc.target/i386/vartrace-9.c: New test.
---
 gcc/testsuite/g++.dg/vartrace-3.C   | 14 +++
 gcc/testsuite/g++.dg/vartrace-ret.C | 17 +
 gcc/testsuite/g++.dg/vartrace-ret2.C| 24 
 gcc/testsuite/gcc.target/i386/vartrace-1.c  | 41 +
 gcc/testsuite/gcc.target/i386/vartrace-10.c | 13 +++
 gcc/testsuite/gcc.target/i386/vartrace-11.c | 16 
 gcc/testsuite/gcc.target/i386/vartrace-12.c | 16 
 gcc/testsuite/gcc.target/i386/vartrace-13.c | 18 +
 gcc/testsuite/gcc.target/i386/vartrace-14.c | 17 +
 gcc/testsuite/gcc.target/i386/vartrace-15.c | 12 ++
 gcc/testsuite/gcc.target/i386/vartrace-16.c | 12 ++
 gcc/testsuite/gcc.target/i386/vartrace-17.c | 23 
 gcc/testsuite/gcc.target/i386/vartrace-18.c | 20 ++
 gcc/testsuite/gcc.target/i386/vartrace-19.c | 22 +++
 gcc/testsuite/gcc.target/i386/vartrace-2.c  |  9 +
 gcc/testsuite/gcc.target/i386/vartrace-20.c | 23 
 gcc/testsuite/gcc.target/i386/vartrace-3.c  |  9 +
 gcc/testsuite/gcc.target/i386/vartrace-4.c  | 13 +++
 gcc/testsuite/gcc.target/i386/vartrace-5.c  | 11 ++
 gcc/testsuite/gcc.target/i386/vartrace-6.c  | 13 +++
 gcc/testsuite/gcc.target/i386/vartrace-7.c  | 11 ++
 gcc/testsuite/gcc.target/i386/vartrace-8.c  | 11 ++
 gcc/testsuite/gcc.target/i386/vartrace-9.c  | 10 +
 gcc/testsuite/gcc.target/vartrace-19.c  | 23 
 24 files changed, 398 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/vartrace-3.C
 create mode 100644 gcc/testsuite/g++.dg/vartrace-ret.C
 create mode 100644 gcc/testsuite/g++.dg/vartrace-ret2.C
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-14.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-15.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-16.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-17.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-18.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-19.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-20.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vartrace-9.c
 create mode 100644 gcc/testsuite/gcc.target/vartrace-19.c

diff --git a/gcc/testsuite/g++.dg/vartrace-3.C 
b/gcc/testsuite/g++.dg/vartrace-3.C
new file mode 100644
index 000..217db297baa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vartrace-3.C
@@ -0,0 +1,14 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -mptwrite -fvartrace=args " } */
+/* { dg-final { scan-assembler "ptwrite" } } */
+
+int a;
+int b(int c) 
+{
+  if (a)
+c += 1;
+  else
+c += b(a);
+  b(c);
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/vartrace-ret.C 
b/gcc/testsuite/g++.dg/vartrace-ret.C
new file mode 100644
index 000..5824e3f3738
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vartrace-ret.C
@@ -0,0 +1,17 @@
+/* 

Updated version of vartrace pass

2018-11-15 Thread Andi Kleen
First patch is x86 specific for Uros, the rest is 
the updated generic middle end implementation.

Support automatic data tracing using the PTWRITE instruction
on Intel CPUs.

This is an updated version addressing (nearly) all the
excellent review comments. It ended up being large scale changes
in the pass, so the code is quite different now. It
now instruments more cases and has many bug fixes.

Still open issues (hopefully some can be addressed in later phases):
- There are still (and even more now) redundant PTWRITEs.
- The tracing only accepts int and long, but not short
and char or bitfields.
- The vartrace pass still requires a full SSA rebuild if it changes
anything in a function. I tried to fix it, but was not successfull.
- There are some holes in tracing globals/local reads for function
arguments.
- Copies and memsets are not traced.
- The store tracing re-reads the target memory, so could suffer from
data races. This is needed for now to get the debug information of the
store right. Later this could be improved to use the input
of the store instead, but would require some changes to var tracing.

Passed boot strap and testing with PTWRITE enabled in the compiler
and also some fuzz testing with csmith.

Ok to apply now?

-Andi





[PATCH v2 1/3] Allow memory operands for PTWRITE

2018-11-15 Thread Andi Kleen
From: Andi Kleen 

The earlier PTWRITE builtin definition was unnecessarily restrictive,
only allowing register input to PTWRITE. The instruction actually
supports memory operands too, so allow that too.

gcc/:

2018-11-15  Andi Kleen  

* config/i386/i386.md: Allow memory operands to ptwrite.
---
 gcc/config/i386/i386.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 44db8ac954c..9c359c0ca04 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -19501,7 +19501,7 @@
(set_attr "prefix_extra" "2")])
 
 (define_insn "ptwrite"
-  [(unspec_volatile [(match_operand:SWI48 0 "register_operand" "r")]
+  [(unspec_volatile [(match_operand:SWI48 0 "nonimmediate_operand" "rm")]
UNSPECV_PTWRITE)]
   "TARGET_PTWRITE"
   "ptwrite\t%0"
-- 
2.19.1



[PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-15 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

Please let me know if there is something I need to change here
to make the fix acceptable or if I should stop trying.

On 10/31/2018 10:33 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

On 10/20/2018 06:01 PM, Martin Sebor wrote:

On 10/16/2018 03:21 PM, Jeff Law wrote:

On 10/4/18 9:51 AM, Martin Sebor wrote:

On 10/04/2018 08:58 AM, Jeff Law wrote:

On 8/27/18 9:42 AM, Richard Biener wrote:

On Mon, Aug 27, 2018 at 5:32 PM Jeff Law  wrote:


On 08/27/2018 02:29 AM, Richard Biener wrote:

On Sun, Aug 26, 2018 at 7:26 AM Jeff Law  wrote:


On 08/24/2018 09:58 AM, Martin Sebor wrote:

The warning suppression for -Wstringop-truncation looks for
the next statement after a truncating strncpy to see if it
adds a terminating nul.  This only works when the next
statement can be reached using the Gimple statement iterator
which isn't until after gimplification.  As a result, strncpy
calls that truncate their constant argument that are being
folded to memcpy this early get diagnosed even if they are
followed by the nul assignment:

  const char s[] = "12345";
  char d[3];

  void f (void)
  {
strncpy (d, s, sizeof d - 1);   // -Wstringop-truncation
d[sizeof d - 1] = 0;
  }

To avoid the warning I propose to defer folding strncpy to
memcpy until the pointer to the basic block the strnpy call
is in can be used to try to reach the next statement (this
happens as early as ccp1).  I'm aware of the preference to
fold things early but in the case of strncpy (a relatively
rarely used function that is often misused), getting
the warning right while folding a bit later but still fairly
early on seems like a reasonable compromise.  I fear that
otherwise, the false positives will drive users to adopt
other unsafe solutions (like memcpy) where these kinds of
bugs cannot be as readily detected.

Tested on x86_64-linux.

Martin

PS There still are outstanding cases where the warning can
be avoided.  I xfailed them in the test for now but will
still try to get them to work for GCC 9.

gcc-87028.diff


PR tree-optimization/87028 - false positive -Wstringop-truncation
strncpy with global variable source string
gcc/ChangeLog:

  PR tree-optimization/87028
  * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid
folding when
  statement doesn't belong to a basic block.
  * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Handle
MEM_REF on
  the left hand side of assignment.

gcc/testsuite/ChangeLog:

  PR tree-optimization/87028
  * c-c++-common/Wstringop-truncation.c: Remove xfails.
  * gcc.dg/Wstringop-truncation-5.c: New test.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 07341eb..284c2fb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy
(gimple_stmt_iterator *gsi,
   if (tree_int_cst_lt (ssize, len))
 return false;

+  /* Defer warning (and folding) until the next statement in the
basic
+ block is reachable.  */
+  if (!gimple_bb (stmt))
+return false;

I think you want cfun->cfg as the test here.  They should be
equivalent
in practice.


Please do not add 'cfun' references.  Note that the next stmt is
also accessible
when there is no CFG.  I guess the issue is that we fold this
during
gimplification where the next stmt is not yet "there" (but still in
GENERIC)?

That was my assumption.  I almost suggested peeking at gsi_next and
avoiding in that case.


So I'd rather add guards to maybe_fold_stmt in the gimplifier then.

So I think the concern with adding the guards to maybe_fold_stmt is
the
possibility of further fallout.

I guess they could be written to target this case specifically to
minimize fallout, but that feels like we're doing the same thing
(band-aid) just in a different place.







We generally do not want to have unfolded stmts in the IL when we
can avoid that
which is why we fold most stmts during gimplification.  We also do
that because
we now do less folding on GENERIC.

But an unfolded call in the IL should always be safe and we've got
plenty of opportunities to fold it later.


Well - we do.  The very first one is forwprop though which means
we'll miss to
re-write some memcpy parts into SSA:

  NEXT_PASS (pass_ccp, false /* nonzero_p */);
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);

likewise early object-size will be confused by memcpy calls that just
exist
to avoid TBAA issues (another of our recommendations besides using
unions).

We do fold mem* early for a reason ;)

"We can always do warnings earlier" would be a similar true sentence.

I'm not disagreeing at all.  There's a natural tension between the
benefits of folding early to enable more optimizations downstream and
leaving the IL in a state where we can give actionable warnings.


Similar trade-offs between folding early and 

[PING #4] [PATCH] look harder for MEM_REF operand equality to avoid -Wstringop-truncation (PR 84561)

2018-11-15 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html

Do I need to change something in this patch to make it acceptable
or should I assume we will leave this bug in GCC unfixed?

On 10/31/2018 10:35 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html

On 10/08/2018 03:40 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html

On 10/01/2018 03:30 PM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01934.html

We have discussed a number of different approaches to moving
the warning somewhere else but none is feasible in the limited
amount of time remaining in stage 1 of GCC 9.  I'd like to
avoid the false positive in GCC 9 by using the originally
submitted, simple approach and look into the suggested design
changes for GCC 10.

On 09/21/2018 08:36 AM, Martin Sebor wrote:

On 09/20/2018 03:06 AM, Richard Biener wrote:

On Wed, Sep 19, 2018 at 4:19 PM Martin Sebor  wrote:


On 09/18/2018 10:23 PM, Jeff Law wrote:

On 9/18/18 1:46 PM, Martin Sebor wrote:

On 09/18/2018 12:58 PM, Jeff Law wrote:

On 9/18/18 11:12 AM, Martin Sebor wrote:


My bad.  Sigh. CCP doesn't track copies, just constants, so
there's not
going to be any data structure you can exploit.  And I don't
think
there's a value number you can use to determine the two objects
are the
same.

Hmm, let's back up a bit, what is does the relevant part of the
IL look
like before CCP?  Is the real problem here that we have
unpropagated
copies lying around in the IL?  Hmm, more likely the IL looksl
ike:

   _8 = _3(D)->a;
   _9 = _8;
   _1 = _9;
   strncpy (MEM_REF (_3(D)->a), ...);
   MEM[(struct S *)_1].a[n_7] = 0;


Yes, that is what the folder sees while the strncpy call is
being transformed/folded by ccp.  The MEM_REF is folded just
after the strncpy call and that's when it's transformed into

  MEM[(struct S *)_8].a[n_7] = 0;

(The assignments to _1 and _9 don't get removed until after
the dom walk finishes).



If we were to propagate the copies out we'd at best have:

   _8 = _3(D)->a;
   strncpy (MEM_REF (_3(D)->a), ...);
   MEM[(struct S *)_8].a[n_7] = 0;


Is that in a form you can handle?  Or would we also need to
forward
propagate the address computation into the use of _8?


The above works as long as we look at the def_stmt of _8 in
the MEM_REF (we currently don't).  That's also what the last
iteration of the loop does.  In this case (with _8) it would
be discovered in the first iteration, so the loop could be
replaced by a simple if statement.

But I'm not sure I understand the concern with the loop.  Is
it that we are looping at all, i.e., the cost?  Or that ccp
is doing something wrong or suboptimal? (Should have
propagated the value of _8 earlier?)

I suspect it's more a concern that things like copies are
typically
propagated away.   So their existence in the IL (and consequently
your
need to handle them) raises the question "has something else
failed to
do its job earlier".

During which of the CCP passes is this happening?  Can we pull the
warning out of the folder (even if that means having a distinct
warning
pass over the IL?)


It happens during the third run of the pass.

The only way to do what you suggest that I could think of is
to defer the strncpy to memcpy transformation until after
the warning pass.  That was also my earlier suggestion: defer
both it and the warning until the tree-ssa-strlen pass (where
the warning is implemented to begin with -- the folder calls
into it).

If it's happening that late (CCP3) in general, then ISTM we ought
to be
able to get the warning out of the folder.  We just have to pick the
right spot.

warn_restrict runs before fold_all_builtins, but after dom/vrp so we
should have the IL in pretty good shape.  That seems like about the
right time.

I wonder if we could generalize warn_restrict to be a more generic
warning pass over the IL and place it right before fold_builtins.


The restrict pass doesn't know about string lengths so it can't
handle all the warnings about string built-ins (the strlen pass
now calls into it to issue some).  The strlen pass does so it
could handle most if not all of them (the folder also calls
into it to issue some warnings).  It would work even better if
it were also integrated with the object size pass.

We're already working on merging strlen with sprintf.  It seems
to me that the strlen pass would benefit not only from that but
also from integrating with object size and warn-restrict.  With
that, -Wstringop-overflow could be moved from builtins.c into
it as well (and also benefit not only from accurate string
lengths but also from the more accurate object size info).

What do you think about that?


I think integrating the various "passes" (objectsize is also
as much a facility as a pass) generally makes sense given
it might end up improving all of them and reduce code duplication.


Okay.  If Jeff agrees I'll see if I can make it happen for GCC
10.  Until then, 

PING [PATCH] add simple attribute introspection

2018-11-15 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01473.html
(Still looking for an approval.)

On 11/09/2018 04:43 PM, Martin Sebor wrote:

On 11/09/2018 12:59 PM, Jeff Law wrote:

On 10/31/18 10:27 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01473.html

With the C++ bits approved I'm still looking for a review/approval
of the remaining changes: the C front end and the shared c-family
handling of the new built-in.

I thought I acked those with just a couple minor doc nits:


I don't see a formal approval for the rest in my Inbox or in
the archive.


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8ffb0cd..dcf4747 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2649,8 +2649,9 @@ explicit @code{externally_visible} attributes

are still necessary.

 @cindex @code{flatten} function attribute
 Generally, inlining into a function is limited.  For a function

marked with

 this attribute, every call inside this function is inlined, if
possible.
-Whether the function itself is considered for inlining depends on its

size and

-the current inlining parameters.
+Functions declared with attribute @code{noinline} and similar are not
+inlined.  Whether the function itself is considered for inlining
depends
+on its size and the current inlining parameters.

Guessing this was from another doc patch that I think has already been
approved


Yes.  It shouldn't be in the latest patch at the link above.


@@ -11726,6 +11728,33 @@ check its compatibility with @var{size}.

 @end deftypefn

+@deftypefn {Built-in Function} bool __builtin_has_attribute

(@var{type-or-expression}, @var{attribute})

+The @code{__builtin_has_attribute} function evaluates to an integer

constant

+expression equal to @code{true} if the symbol or type referenced by
+the @var{type-or-expression} argument has been declared with
+the @var{attribute} referenced by the second argument.  Neither
argument
+is valuated.  The @var{type-or-expression} argument is subject to the

same
s/valuated/evaluated/ ?


This should also be fixed in the latest patch at the link above.


Did the implementation change significantly requiring another review
iteration?


I don't think it changed too significantly between the last two
revisions but I don't have a record of anyone having approved
the C FE and the middle-end bits.  (Sorry if I missed it.) Other
than this response from you all I see in the archive is this:

  https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00606.html

Please let me if the last revision is okay to commit.

Thanks
Martin




[PATCH, csky] Update linux-unwind.h for kernel

2018-11-15 Thread 瞿仙淼
Hi, 
I have submitted a patch to update linux-unwind for kernel


Index: libgcc/ChangeLog
===
--- libgcc/ChangeLog(revision 266199)
+++ libgcc/ChangeLog(working copy)
@@ -1,5 +1,11 @@
 2018-11-15  Xianmiao Qu  
 
+   * config/csky/linux-unwind.h (sc_pt_regs): Update for kernel. 
+   (sc_pt_regs_lr): Update for kernel.
+   (sc_pt_regs_tls): Update for kernel.
+
+2018-11-15  Xianmiao Qu  
+
* config/csky/linux-unwind.h: Fix coding style.
 
 2018-11-13  Xianmiao Qu  
Index: libgcc/config/csky/linux-unwind.h
===
--- libgcc/config/csky/linux-unwind.h   (revision 266199)
+++ libgcc/config/csky/linux-unwind.h   (working copy)
@@ -47,9 +47,9 @@
 #define MOVI_R7_139_V2_PART0 0xea07
 #define MOVI_R7_139_V2_PART1 139
 
-#define sc_pt_regs(x) (sc->sc_##x)
-#define sc_pt_regs_lr (sc->sc_r15)
-#define sc_pt_regs_tls(x) (sc->sc_exregs[15])
+#define sc_pt_regs(x) (sc->sc_pt_regs.x)
+#define sc_pt_regs_lr (sc->sc_pt_regs.lr)
+#define sc_pt_regs_tls(x) sc_pt_regs(x)
 
 #define MD_FALLBACK_FRAME_STATE_FOR csky_fallback_frame_state





Re: [PATCH] RFC: C/C++: print help when a header can't be found

2018-11-15 Thread Eric Gallager
On 11/12/18, Martin Sebor  wrote:
> On 11/11/2018 04:33 PM, David Malcolm wrote:
>> When gcc can't find a header file, it's a hard error that stops the
>> build,
>> typically requiring the user to mess around with compile flags,
>> Makefiles,
>> dependencies, and so forth.
>>
>> Often the exact search paths aren't obvious to the user.  Consider the
>> case where the include paths are injected via a tool such as pkg-config,
>> such as e.g.:
>>
>>   gcc $(pkg-config --cflags glib-2.0) demo.c
>>
>> This patch is an attempt at being more helpful for such cases.  Given
>> that
>> the user can't proceed until the issue is resolved, I think it's
>> reasonable
>> to default to telling the user as much as possible about what happened.
>> This patch list all of the search paths, and any close matches (e.g. for
>> misspellings).
>>
>> Without the patch, the current behavior is:
>>
>> misspelled-header-1.c:1:10: fatal error: test-header.hpp: No such file or
>> directory
>> 1 | #include "test-header.hpp"
>>   |  ^
>> compilation terminated.
>>
>> With the patch, the user gets this output:
>>
>> misspelled-header-1.c:1:10: fatal error: test-header.hpp: No such file or
>> directory
>> 1 | #include "test-header.hpp"
>>   |  ^
>> misspelled-header-1.c:1:10: note: paths searched:
>> misspelled-header-1.c:1:10: note:  path: ''
>> misspelled-header-1.c:1:10: note:   not found: 'test-header.hpp'
>> misspelled-header-1.c:1:10: note:   close match: 'test-header.h'
>> 1 | #include "test-header.hpp"
>>   |  ^
>>   |  "test-header.h"
>> misspelled-header-1.c:1:10: note:  path: '/usr/include/glib-2.0' (via
>> '-I')
>> misspelled-header-1.c:1:10: note:   not found:
>> '/usr/include/glib-2.0/test-header.hpp'
>> misspelled-header-1.c:1:10: note:  path: '/usr/lib64/glib-2.0/include'
>> (via '-I')
>> misspelled-header-1.c:1:10: note:   not found:
>> '/usr/lib64/glib-2.0/include/test-header.hpp'
>> misspelled-header-1.c:1:10: note:  path: './include' (system directory)
>> misspelled-header-1.c:1:10: note:   not found:
>> './include/test-header.hpp'
>> misspelled-header-1.c:1:10: note:  path: './include-fixed' (system
>> directory)
>> misspelled-header-1.c:1:10: note:   not found:
>> './include-fixed/test-header.hpp'
>> misspelled-header-1.c:1:10: note:  path: '/usr/local/include' (system
>> directory)
>> misspelled-header-1.c:1:10: note:   not found:
>> '/usr/local/include/test-header.hpp'
>> misspelled-header-1.c:1:10: note:  path: '/usr/include' (system
>> directory)
>> misspelled-header-1.c:1:10: note:   not found:
>> '/usr/include/test-header.hpp'
>> compilation terminated.
>>
>> showing the paths that were tried, and why (e.g. the -I paths injected by
>> the pkg-config invocation), and the .hpp vs .h issue (with a fix-it
>> hint).
>>
>> It's verbose, but as I said above, the user can't proceed until they
>> resolve it, so I think being verbose is appropriate here.
>>
>> Thoughts?
>
> I think printing the directories and especially the near matches
> will be very helpful, especially for big projects with lots of -I
> options.
>
> The output could be made substantially shorter, less repetitive,
> and so easier to read -- basically cut in half -- by avoiding
> most of the duplication and collapsing two notes into one, e.g.
> like so:
>
>fatal error: test-header.hpp: No such file or directory
>1 | #include "test-header.hpp"
>  |  ^
>note: paths searched:
>note: -I '.'
>note:   close match: 'test-header.h'
>1 | #include "test-header.hpp"
>  |  ^
>  |  "test-header.h"
>note: -I '/usr/include/glib-2.0'
>note: -I '/usr/lib64/glib-2.0/include'
>note: -isystem './include'
>note: -isystem './include-fixed'
>note: -isystem '/usr/local/include'
>note: -isystem '/usr/include'
>
> or by printing the directories in sections:
>
>note: -I paths searched:
>note:   '.'
>note:   close match: 'test-header.h'
>1 | #include "test-header.hpp"
>  |  ^
>  |  "test-header.h"
>note:   '/usr/include/glib-2.0'
>note:   '/usr/lib64/glib-2.0/include'
>note: -isystem paths searched:
>note:   './include'
>note:   './include-fixed'
>note:   '/usr/local/include'
>note:   '/usr/include'
>
> Martin
>
>

How would this interact with -Wmissing-include-dirs?


[doc, committed] clarify that "packed" attribute can apply to C++ classes

2018-11-15 Thread Sandra Loosemore
I've checked in this patch for PR 25759, another minor documentation 
improvement.


-Sandra
2018-11-15  Sandra Loosemore  

	PR c++/25759

	gcc/
	* doc/extend.texi (Common Type Attributes): Make it explicit
	that attribute "packed" can apply to C++ classes.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 266195)
+++ gcc/doc/extend.texi	(working copy)
@@ -7314,17 +7314,16 @@ or @code{__pointer__} for the mode used
 
 @item packed
 @cindex @code{packed} type attribute
-This attribute, attached to @code{struct} or @code{union} type
-definition, specifies that each member (other than zero-width bit-fields)
-of the structure or union is placed to minimize the memory required.  When
-attached to an @code{enum} definition, it indicates that the smallest
-integral type should be used.
+This attribute, attached to @code{struct}, @code{union}, or C++ @code{class}
+type definition, specifies that each of its members (other than zero-width
+bit-fields) is placed to minimize the memory required.  This is equivalent
+to specifying the @code{packed} attribute on each of the members.
 
 @opindex fshort-enums
-Specifying the @code{packed} attribute for @code{struct} and @code{union}
-types is equivalent to specifying the @code{packed} attribute on each
-of the structure or union members.  Specifying the @option{-fshort-enums}
-flag on the command line is equivalent to specifying the @code{packed}
+When attached to an @code{enum} definition, the @code{packed} attribute
+indicates that the smallest integral type should be used.
+Specifying the @option{-fshort-enums} flag on the command line
+is equivalent to specifying the @code{packed}
 attribute on all @code{enum} definitions.
 
 In the following example @code{struct my_packed_struct}'s members are
@@ -7348,8 +7347,9 @@ struct __attribute__ ((__packed__)) my_p
 @end smallexample
 
 You may only specify the @code{packed} attribute attribute on the definition
-of an @code{enum}, @code{struct} or @code{union}, not on a @code{typedef}
-that does not also define the enumerated type, structure or union.
+of an @code{enum}, @code{struct}, @code{union}, or @code{class}, 
+not on a @code{typedef} that does not also define the enumerated type,
+structure, union, or class.
 
 @item scalar_storage_order ("@var{endianness}")
 @cindex @code{scalar_storage_order} type attribute


Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-15 Thread Martin Sebor

On 11/14/2018 02:39 AM, Alexander Monakov wrote:

On Tue, 13 Nov 2018, Martin Sebor wrote:


In PR 88000 the reporter expects to be able to use an explicit
register local variable in a context where it isn't supported
i.e., for something other than an input or output of an asm
statement: namely to pass it as argument to a user-defined
function.  GCC emits unexpected object code in this case which
the reporter thought was a GCC bug.


I appreciate warnings for misuse of extensions in general, but in
this particular case I think GCC's behavior is misdesigned, so
instead of enshrining a bad engineering choice in a warning and
in the manual, I'd rather see GCC implement the extension sanely.

Merely changing a normal auto variable to a register asm variable
should not invalidate the program. As the manual says, it should
amount to providing a hint to the register allocator, at most.

Have a look at PR 87984, where code is miscompiled despite following
the documentation to the letter. This is because we lower accesses to
register variables when transitioning from GIMPLE to RTL incorrectly.
Fixing that should make the warning unnecessary. I hope I can work on
that before stage 4.

I think LLVM is doing the right thing there, and so should we.


That would indeed be preferable but it's not something I expect
to have the cycles to work on.  I put the patch together only
because it seemed like an easy way to help keep users from
shooting themselves in the foot (to borrow the phrase from
the comment in PR 88000).

If there's consensus that the general use case of passing hard
registers to functions should be supported then I suggest to
acknowledge PR 88000 as a bug.  We can discuss whether it's
possible and worthwhile to warn on the unsupported subset of
the use cases.  In any case, I think the least we can for now
is to clarify in the manual what that supported subset is,
since as I (and others) read it, passing hard registers to
functions is not.

Martin


[PATCH] v2: C/C++: add fix-it hints for missing '&' and '*' (PR c++/87850)

2018-11-15 Thread David Malcolm
On Tue, 2018-11-13 at 16:34 -0500, Jason Merrill wrote:
> On Mon, Nov 12, 2018 at 4:32 PM Martin Sebor 
> wrote:
> > On 11/11/2018 02:02 PM, David Malcolm wrote:
> > > On Sun, 2018-11-11 at 11:01 -0700, Martin Sebor wrote:
> > > > On 11/10/2018 12:01 AM, Eric Gallager wrote:
> > > > > On 11/9/18, David Malcolm  wrote:
> > > > > > This patch adds a fix-it hint to various pointer-vs-non-
> > > > > > pointer
> > > > > > diagnostics, suggesting the addition of a leading '&' or
> > > > > > '*'.
> > > > > > 
> > > > > > For example, note the ampersand fix-it hint in the
> > > > > > following:
> > > > > > 
> > > > > > demo.c:5:22: error: invalid conversion from 'pthread_key_t'
> > > > > > {aka
> > > > > > 'unsigned
> > > > > > int'}
> > > > > >to 'pthread_key_t*' {aka 'unsigned int*'} [-fpermissive]
> > > > > > 5 |   pthread_key_create(key, NULL);
> > > > > >   |  ^~~
> > > > > >   |  |
> > > > > >   |  pthread_key_t {aka unsigned
> > > > > > int}
> > > > > >   |  &
> > > > > 
> > > > > Having both the type and the fixit underneath the caret looks
> > > > > kind
> > > > > of confusing
> > > > 
> > > > I agree it's rather subtle.  Keeping the diagnostics separate
> > > > from
> > > > the suggested fix should avoid the confusion.
> > > 
> > > FWIW, the fix-it hint is in a different color (assuming that gcc
> > > is
> > > invoked in an environment that prints that...)
> > 
> > I figured it would be, but I'm still not sure it's good design
> > to be relying on color alone to distinguish between the problem
> > and the suggested fix.  Especially when they are so close to one
> > another and the fix is just a single character with no obvious
> > relationship to the rest of the text on the screen.  In other
> > warnings there's at least the "did you forget the '@'?" part
> > to give a clue, even though even there the connection between
> > the "did you forget" and the & several lines down wouldn't
> > necessarily be immediately apparent.
> 
> Agreed, something along those lines would help to understand why the
> compiler is throwing a random & into the diagnostic.
> 
> Jason

Here's an updated version which adds a note, putting the fix-it hint
on that instead (I attempted adding the text to the initial error,
but there was something of a combinatorial explosion of messages).

The above example becomes:

demo.c: In function 'int main()':
demo.c:5:22: error: invalid conversion from 'pthread_key_t' {aka 'unsigned int'}
   to 'pthread_key_t*' {aka 'unsigned int*'} [-fpermissive]
5 |   pthread_key_create(key, NULL);
  |  ^~~
  |  |
  |  pthread_key_t {aka unsigned int}
demo.c:5:22: note: possible fix: take the address with '&'
5 |   pthread_key_create(key, NULL);
  |  ^~~
  |  &
In file included from demo.c:1:
/usr/include/pthread.h:1122:47: note:   initializing argument 1 of
   'int pthread_key_create(pthread_key_t*, void (*)(void*))'
 1122 | extern int pthread_key_create (pthread_key_t *__key,
  |~~~^

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/c-family/ChangeLog:
PR c++/87850
* c-common.c: Include "gcc-rich-location.h".
(maybe_emit_indirection_note): New function.
* c-common.h (maybe_emit_indirection_note): New decl.

gcc/c/ChangeLog:
PR c++/87850
* c-typeck.c (convert_for_assignment): Call
maybe_emit_indirection_note for pointer vs non-pointer
diagnostics.

gcc/cp/ChangeLog:
PR c++/87850
* call.c (convert_like_real): Call
maybe_emit_indirection_note for "invalid conversion" diagnostic.

gcc/testsuite/ChangeLog:
PR c++/87850
* c-c++-common/indirection-fixits.c: New test.
---
 gcc/c-family/c-common.c |  33 +++
 gcc/c-family/c-common.h |   2 +
 gcc/c/c-typeck.c|  10 +-
 gcc/cp/call.c   |   2 +
 gcc/testsuite/c-c++-common/indirection-fixits.c | 270 
 5 files changed, 315 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/indirection-fixits.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index cd88f3a..d5c7c7f 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "substring-locations.h"
 #include "spellcheck.h"
+#include "gcc-rich-location.h"
 #include "selftest.h"
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
@@ -8405,6 +8406,38 @@ maybe_suggest_missing_token_insertion (rich_location 
*richloc,
 }
 }
 
+/* Potentially emit a note about likely missing '&' or '*',
+   depending on EXPR and 

Re: [PATCH] Support simd function declarations via a pre-include. (was: [PATCH][RFC]Overloading intrinsics)

2018-11-15 Thread Jakub Jelinek
On Thu, Nov 15, 2018 at 08:40:13PM +0100, Bernhard Reutner-Fischer wrote:
> On 14 November 2018 12:35:27 CET, Jakub Jelinek  wrote:
> 
> >> --- a/gcc/config/gnu-user.h
> >> +++ b/gcc/config/gnu-user.h
> >> @@ -170,3 +170,6 @@ see the files COPYING3 and COPYING.RUNTIME
> >respectively.  If not, see
> >>LD_STATIC_OPTION " --whole-archive -llsan --no-whole-archive " \
> >>LD_DYNAMIC_OPTION "}}%{!static-liblsan:-llsan}"
> >>  #endif
> >> +
> >> +#undef TARGET_F951_NOSTDINC_OPTIONS
> >> +#define TARGET_F951_NOSTDINC_OPTIONS
> >"%:fortran-header-file(-fpre-include= math-vector-fortran.h)"
> >
> >Too long line, use some \s to split it up.
> >
> 
> Can we use plain -include like in C?

Wouldn't that be confusing whether it is included in preprocessor only or if
it is included as a magic fortran include line at the beginning?

Jakub


Re: [PATCH][rs6000] inline expansion of memcmp using vsx

2018-11-15 Thread Aaron Sawdey
On 11/15/18 4:02 AM, Richard Biener wrote:
> On Wed, Nov 14, 2018 at 5:43 PM Aaron Sawdey  wrote:
>>
>> This patch generalizes some the functions added earlier to do vsx expansion 
>> of strncmp
>> so that the can also generate the code needed for memcmp. I reorganized
>> expand_block_compare() a little to be able to make use of this there. The 
>> vsx code is more
>> compact so I've changed the default block compare inline limit to 63 bytes. 
>> The vsx
>> code is only used if there is at least 16 bytes to compare as this means we 
>> don't have to
>> do complex code to compare less than one chunk. If vsx is not available the 
>> limit is cut
>> in half. The performance is good, vsx memcmp is considerably faster than the 
>> gpr inline code
>> if the strings are equal and is comparable if the strings have a 10% chance 
>> of being
>> equal (spread across the string).
> 
> How is performance affected if there are close earlier char-size
> stores to one of the string/memory?
> Can power still do store forwarding in this case?

Store forwarding between scalar and vector is not great, but it's
better than having to make a plt call to memcmp() which may well use
vsx anyway. I had set the crossover between scalar and vsx at 16 bytes
because the vsx code is more compact. The performance is similar for
16-32 byte sizes. But you could make an argument for switching at 33
bytes. This way builtin memcmp of 33-64 bytes would now use inline vsx
code instead of memcmp() call. At 33 bytes the vsx inline code is 3x
faster than a memcmp() call so would likely remain faster even if
there was an ugly vector-load-hit-scalar-store. Also small structures
32 bytes and less being compared would use scalar code and the same as
gcc 8 and would avoid this issue.

  Aaron

> 
>> Currently regtesting, ok for trunk if tests pass?
>>
>> Thanks!
>>Aaron
>>
>> 2018-11-14  Aaron Sawdey  
>>
>> * config/rs6000/rs6000-string.c (emit_vsx_zero_reg): New function.
>> (expand_cmp_vec_sequence): Rename and modify
>> expand_strncmp_vec_sequence.
>> (emit_final_compare_vec): Rename and modify 
>> emit_final_str_compare_vec.
>> (generate_6432_conversion): New function.
>> (expand_block_compare): Add support for vsx.
>> (expand_block_compare_gpr): New function.
>> * config/rs6000/rs6000.opt (rs6000_block_compare_inline_limit): 
>> Increase
>> default limit to 63 because of more compact vsx code.
>>
>>
>>
>>
>> Index: gcc/config/rs6000/rs6000-string.c
>> ===
>> --- gcc/config/rs6000/rs6000-string.c   (revision 266034)
>> +++ gcc/config/rs6000/rs6000-string.c   (working copy)
>> @@ -615,6 +615,283 @@
>>  }
>>  }
>>
>> +static rtx
>> +emit_vsx_zero_reg()
>> +{
>> +  unsigned int i;
>> +  rtx zr[16];
>> +  for (i = 0; i < 16; i++)
>> +zr[i] = GEN_INT (0);
>> +  rtvec zv = gen_rtvec_v (16, zr);
>> +  rtx zero_reg = gen_reg_rtx (V16QImode);
>> +  rs6000_expand_vector_init (zero_reg, gen_rtx_PARALLEL (V16QImode, zv));
>> +  return zero_reg;
>> +}
>> +
>> +/* Generate the sequence of compares for strcmp/strncmp using vec/vsx
>> +   instructions.
>> +
>> +   BYTES_TO_COMPARE is the number of bytes to be compared.
>> +   ORIG_SRC1 is the unmodified rtx for the first string.
>> +   ORIG_SRC2 is the unmodified rtx for the second string.
>> +   S1ADDR is the register to use for the base address of the first string.
>> +   S2ADDR is the register to use for the base address of the second string.
>> +   OFF_REG is the register to use for the string offset for loads.
>> +   S1DATA is the register for loading the first string.
>> +   S2DATA is the register for loading the second string.
>> +   VEC_RESULT is the rtx for the vector result indicating the byte 
>> difference.
>> +   EQUALITY_COMPARE_REST is a flag to indicate we need to make a cleanup 
>> call
>> +   to strcmp/strncmp if we have equality at the end of the inline 
>> comparison.
>> +   P_CLEANUP_LABEL is a pointer to rtx for a label we generate if we need 
>> code
>> +   to clean up and generate the final comparison result.
>> +   FINAL_MOVE_LABEL is rtx for a label we can branch to when we can just
>> +   set the final result.
>> +   CHECKZERO indicates whether the sequence should check for zero bytes
>> +   for use doing strncmp, or not (for use doing memcmp).  */
>> +static void
>> +expand_cmp_vec_sequence (unsigned HOST_WIDE_INT bytes_to_compare,
>> +rtx orig_src1, rtx orig_src2,
>> +rtx s1addr, rtx s2addr, rtx off_reg,
>> +rtx s1data, rtx s2data, rtx vec_result,
>> +bool equality_compare_rest, rtx *p_cleanup_label,
>> +rtx final_move_label, bool checkzero)
>> +{
>> +  machine_mode load_mode;
>> +  unsigned int load_mode_size;
>> +  unsigned HOST_WIDE_INT cmp_bytes = 0;
>> +  unsigned HOST_WIDE_INT offset = 0;
>> +  rtx zero_reg = NULL;

[PR c++/86246] ICE tsubst explicit operator call

2018-11-15 Thread Nathan Sidwell
We weren't noticing that 'expr.operator T()' is dependent if T is 
dependent.  Therefore we looked up 'operator T' at template definition 
time and went down hill from there.


86246 and 87989 had different failure modes (one iced, one generated 
wrong code), so adding both.


booted & tested on x86_64-linux, applying to trunk.

nathan
--
Nathan Sidwell
2018-11-15  Nathan Sidwell  

	PR c++/86246
	PR c++/87989
	* typeck.c (finish_class_member_access_expr): Conversion operator
	to dependent type is dependent.

	* g++.dg/template/pr86246.C: New.
	* g++.dg/template/pr87989.C: New.

Index: cp/typeck.c
===
--- cp/typeck.c	(revision 266191)
+++ cp/typeck.c	(working copy)
@@ -2884,7 +2884,12 @@ finish_class_member_access_expr (cp_expr
 	 expression is dependent.  */
 	  || (TREE_CODE (name) == SCOPE_REF
 	  && TYPE_P (TREE_OPERAND (name, 0))
-	  && dependent_scope_p (TREE_OPERAND (name, 0
+	  && dependent_scope_p (TREE_OPERAND (name, 0)))
+	  /* If NAME is operator T where "T" is dependent, we can't
+	 lookup until we instantiate the T.  */
+	  || (TREE_CODE (name) == IDENTIFIER_NODE
+	  && IDENTIFIER_CONV_OP_P (name)
+	  && dependent_type_p (TREE_TYPE (name
 	{
 	dependent:
 	  return build_min_nt_loc (UNKNOWN_LOCATION, COMPONENT_REF,
Index: testsuite/g++.dg/template/pr86246.C
===
--- testsuite/g++.dg/template/pr86246.C	(revision 0)
+++ testsuite/g++.dg/template/pr86246.C	(working copy)
@@ -0,0 +1,38 @@
+// { dg-do compile { target c++11 } }
+// PR c++/86246 ICE in tsubst
+
+namespace std {
+  template struct is_class {
+static constexpr bool value = true;
+  };
+  template<> struct is_class {
+static constexpr bool value = false;
+  };
+}
+
+class MyClass {
+ public:
+  operator double() const {
+return 1;
+  }
+  template
+  operator T() const {
+static_assert(std::is_class::value, "problem");
+return T();
+  }
+};
+
+template
+void SetValue(const MyClass& obj, T* value) {
+  //  erroneously dispatched to operator T when T is double
+  *value = obj.operator T();
+}
+
+int main() {
+  MyClass obj;
+  // works fine
+  obj.operator double ();
+  double x;
+  // error, when operator T is called in SetValue
+  SetValue(obj, );
+}
Index: testsuite/g++.dg/template/pr87989.C
===
--- testsuite/g++.dg/template/pr87989.C	(revision 0)
+++ testsuite/g++.dg/template/pr87989.C	(working copy)
@@ -0,0 +1,20 @@
+// PR c++/87989
+// { dg-do link }
+// Resolved to template instantiation rather than non-template fn.
+
+struct X {
+  template  operator T() const; // no definition
+  operator float() const {return 0.f;}
+};
+
+template 
+T f(const X ) {
+  // Resoved in error to X::operator float() const`
+  // instead of correct `X::operator float() const
+  return x.operator T();
+}
+
+int main ()
+{
+  return f(X ());
+}


Re: [PATCH] Support simd function declarations via a pre-include. (was: [PATCH][RFC]Overloading intrinsics)

2018-11-15 Thread Bernhard Reutner-Fischer
On 14 November 2018 12:35:27 CET, Jakub Jelinek  wrote:

>> --- a/gcc/config/gnu-user.h
>> +++ b/gcc/config/gnu-user.h
>> @@ -170,3 +170,6 @@ see the files COPYING3 and COPYING.RUNTIME
>respectively.  If not, see
>>LD_STATIC_OPTION " --whole-archive -llsan --no-whole-archive " \
>>LD_DYNAMIC_OPTION "}}%{!static-liblsan:-llsan}"
>>  #endif
>> +
>> +#undef TARGET_F951_NOSTDINC_OPTIONS
>> +#define TARGET_F951_NOSTDINC_OPTIONS
>"%:fortran-header-file(-fpre-include= math-vector-fortran.h)"
>
>Too long line, use some \s to split it up.
>

Can we use plain -include like in C?
This has an established meaning and is already documented.

Thanks,


Re: [PATCH] RFC: elide repeated source locations (PR other/84889)

2018-11-15 Thread Eric Gallager
On 11/15/18, Martin Sebor  wrote:
> On 11/14/2018 02:12 PM, David Malcolm wrote:
>> On Mon, 2018-11-12 at 13:37 -0700, Martin Sebor wrote:
>>> On 11/11/2018 07:43 PM, David Malcolm wrote:
 We often emit more than one diagnostic at the same source location.
 For example, the C++ frontend can emit many diagnostics at
 the same source location when suggesting overload candidates.

 For example:

 ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In
 function 'int test_3(s, t)':
 ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
 error: no match for 'operator&&' (operand types are 's' and 't')
38 |   return param_s && param_t;
   |  ^~
 ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
 note: candidate: 'operator&&(bool, bool)' 
 ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
 note:   no known conversion for argument 2 from 't' to 'bool'

 This is overly verbose.  Note how the same location has been
 printed
 three times, obscuring the pertinent messages.

 This patch add a new "elide" value to -fdiagnostics-show-location=
 and makes it the default (previously it was "once").  With elision
 the above is printed as:

 ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In
 function 'int test_3(s, t)':
 ../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
 error: no match for 'operator&&' (operand types are 's' and 't')
38 |   return param_s && param_t;
   |  ^~
   = note: candidate: 'operator&&(bool, bool)' 
   = note:   no known conversion for argument 2 from 't' to
 'bool'

 where the followup notes are printed with a '=' lined up with
 the source code margin.

 Thoughts?
>>>
>>> I agree the long pathname in the notes is at first glance redundant
>>> but I'm not sure about using '=' as a shorthand for it.

I like the -fdiagnostics-show-location=elide option but I also agree
about the '=' looking weird. However, I don't think the '=' is that
big a problem that needs to provoke huge rethinking, just some minor
bikeshedding about which symbol to use instead. How about '+'?

>>> I have
>>> written many scripts to parse GCC output to extract all diagnostics
>>> (including notes) and publish those on a Web page somewhere, as I'm
>>> sure must have others.  All those scripts would stop working with
>>> this change and require changes to the build system to work again.
>>> Making those changes can be a substantial undertaking in some
>>> organizations.
>>>
>>> Have you considered printing just the file name instead?  Or any
>>> other alternatives?
>>
>> "-fdiagnostics-show-location=once" will restore the old behavior.
>> Alternatively, if you want to parse GCC output, I'm adding a JSON
>> output format; see:
>>   https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01038.html
>> (I'm testing an updated version of that patch)
>
> I understand the change can be disabled by an option.  I also
> like making the repetitive pathnames shorter, but the concern
> I have is that the change to use the '=' notation by default,
> besides being rather unusual and not necessarily intuitive, will
> break scripts that search for the pattern:
>
>"[^:]*:[1-9][0-9]*:[1-9][0-9]*: (error|note|warning): "
>
> and adding a new option to a large build system can be quite
> difficult.  I suspect it would also make the notes difficult
> or even impossible to associate with the corresponding errors
> or warnings in parallel builds (where they may be interleaved
> with diagnostics from different compilations).
>
> I think it's possible to make the output shorter/less repetitive
> without these side-effects by keeping the current format and
> instead abbreviating the pathname, e.g. by printing just the file
> name (or ".../filename.c:" with the ellipsis standing in for the
> long pathname, though the ellipsis would require adjusting
> the scripts that sort diagnostics by the pathname).
>
> Martin
>
> PS As an aside, if we wanted to shorten all pathnames this same
> idea could be extended to errors and warnings as well by printing
> the pathname in the first one and just the filename in subsequent
> ones in the same file.
>


Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-11-15 Thread Martin Sebor

On 11/15/2018 03:10 AM, Richard Biener wrote:

On Wed, Nov 14, 2018 at 7:08 PM Jeff Law  wrote:


On 11/6/18 4:56 PM, Martin Sebor wrote:

In response to Joseph's comment I've removed the interaction
with -Wpedantic from the updated patch.

In addition, to help detect bugs like the one in the test case
for pr87886, I have also enhanced the detection of incompatible
calls to include integer/real type mismatches so that calls like
the one below are diagnosed:

  extern double sqrt ();
  int f (int x)
  {
return sqrt (x);   // passing int where double is expected
  }

With the removal of the -Wpedantic interaction declaring abort()
without a prototype is no longer diagnosed and so the test suite
changes to add the prototype are not necessary.  I decided not
to back them out because Jeff indicated a preference for making
these kinds of improvements in general in an unrelated
discussion.



gcc-83656.diff

PR c/83656 - missing -Wbuiltin-declaration-mismatch on declaration without 
prototype

gcc/c/ChangeLog:

  PR c/83656
  * c-decl.c (header_for_builtin_fn): Declare.
  (diagnose_mismatched_decls): Diagnose declarations of built-in
  functions without a prototype.
  * c-typeck.c (maybe_warn_builtin_no_proto_arg): New function.
  (convert_argument): Same.
  (convert_arguments): Factor code out into convert_argument.
  Detect mismatches between built-in formal arguments in calls
  to built-in without prototype.
  (build_conditional_expr): Same.
  (type_or_builtin_type): New function.
  (convert_for_assignment): Add argument.  Conditionally issue
  warnings instead of errors for mismatches.

gcc/testsuite/ChangeLog:

  PR c/83656
  * gcc.dg/20021006-1.c
  * gcc.dg/Wbuiltin-declaration-mismatch.c: New test.
  * gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test.
  * gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test.
  * gcc.dg/Wbuiltin-declaration-mismatch-4.c: New test.
  * gcc.dg/Walloca-16.c: Adjust.
  * gcc.dg/Wrestrict-4.c: Adjust.
  * gcc.dg/Wrestrict-5.c: Adjust.
  * gcc.dg/atomic/stdatomic-generic.c: Adjust.
  * gcc.dg/atomic/stdatomic-lockfree.c: Adjust.
  * gcc.dg/initpri1.c: Adjust.
  * gcc.dg/pr15698-1.c: Adjust.
  * gcc.dg/pr69156.c: Adjust.
  * gcc.dg/pr83463.c: Adjust.
  * gcc.dg/redecl-4.c: Adjust.
  * gcc.dg/tls/thr-init-2.c: Adjust.
  * gcc.dg/torture/pr55890-2.c: Adjust.
  * gcc.dg/torture/pr55890-3.c: Adjust.
  * gcc.dg/torture/pr67741.c: Adjust.
  * gcc.dg/torture/stackalign/sibcall-1.c: Adjust.
  * gcc.dg/torture/tls/thr-init-1.c: Adjust.
  * gcc.dg/tree-ssa/builtins-folding-gimple-ub.c: Adjust.





@@ -3547,8 +3598,24 @@ convert_arguments (location_t loc, vec 
arg_loc, tree typelist,
   if (parmval == error_mark_node)
  error_args = true;

+  if (!type && builtin_type && TREE_CODE (builtin_type) != VOID_TYPE)
+ {
+   /* For a call to a built-in function declared without a prototype,
+  perform the coversions from the argument to the expected type
+  but issue warnings rather than errors for any mismatches.
+  Ignore the converted argument and use the PARMVAL obtained
+  above by applying default coversions instead.  */

s/coversions/conversions/

Two of 'em in that comment.  OK with that nit fixed.


I once had -Wunprototyped-calls...  (attached).  Because the issue doesn't
exist only for builtins... (originally inspired by a Xorg bug)


You're right, thanks for the suggestion.  I'm still planning to
resubmit the patch I posted last June to enable -Wstrict-prototypes
(PR 82922).  Detecting this problem there would be a good enhancement.
If it's too late to do this for GCC 9 I will submit the patch for GCC
10.

Martin


Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)

2018-11-15 Thread Martin Sebor

On 11/15/2018 02:39 AM, Matthew Malcomson wrote:

On 02/11/18 09:54, Christophe Lyon wrote:

Hi,

I've noticed failure on targets using newlib (aarch64-elf and arm-eabi):
FAIL: gcc.c-torture/execute/printf-2.c
FAIL: gcc.c-torture/execute/user-printf.c

my gcc.log contains:
gcc.c-torture/execute/user-printf.c   -O0  execution test (reason: TCL
LOOKUP CHANNEL exp5)
which is not very helpful



@Christophe
Can I ask if your DejaGNU board setup has "needs_status_wrapper 1" for
both of these boards?

I believe this problem is caused by an interaction with the DejaGNU
status wrapper.

When the status wrapper is needed, DejaGNU looks at stdout for a line
saying "*** EXIT code " indicating what the status code was.
When it doesn't find that line it assumes an exit code of 2.
Without the status wrapper DejaGNU takes the return code from the
program executed.

Because these tests use "freopen ()" on stdout, the status wrapper fails
to print to the IO channel DejaGNU is listening on, hence DejaGNU fails
to find it's line, uses an exit code of 2, and fails the test.


@Martin
Would these tests still be valid having run freopen on stderr and using
fprintf instead of printf?
That makes the testcases pass for me.


The printf-2.c test specifically exercises the printf built-in,
i.e., not fprintf, so changing it to use fprintf would defeat
its purpose.

user-printf.c does something similar but for a user-defined
function with attribute format (printf).

The purpose of the tests is to verify that what they read from
the temporary file matches what they wrote (i.e,, that none of
the printf() or user_print() calls was incorrectly eliminated).


If not we could add an
{ dg-require-effective-target unwrapped }
directive in the testcases to stop the failure complaints.


I'm not familiar with this directive or really know what
a status wrapper is but as long as it doesn't change the I/O
the test does I think it should be fine.

Martin


Re: [PING^2] Re: [PATCH 1/3] Support instrumenting returns of instrumented functions

2018-11-15 Thread Andi Kleen
Andi Kleen  writes:

Ping!^2

> Andi Kleen  writes:
>
> Ping!
>
>> From: Andi Kleen 
>>
>> When instrumenting programs using __fentry__ it is often useful
>> to instrument the function return too. Traditionally this
>> has been done by patching the return address on the stack
>> frame on entry. However this is fairly complicated (trace
>> function has to emulate a stack) and also slow because
>> it causes a branch misprediction on every return.
>>
>> Add an option to generate call or nop instrumentation for
>> every return instead, including patch sections.
>>
>> This will increase the program size slightly, but can be a
>> lot faster and simpler.
>>
>> This version only instruments true returns, not sibling
>> calls or tail recursion. This matches the semantics of the
>> original stack.
>>
>> gcc/:
>>
>> 2018-11-04  Andi Kleen  
>>
>>  * config/i386/i386-opts.h (enum instrument_return): Add.
>>  * config/i386/i386.c (output_return_instrumentation): Add.
>>  (ix86_output_function_return): Call output_return_instrumentation.
>>  (ix86_output_call_insn): Call output_return_instrumentation.
>>  * config/i386/i386.opt: Add -minstrument-return=.
>>  * doc/invoke.texi (-minstrument-return): Document.
>>
>> gcc/testsuite/:
>>
>> 2018-11-04  Andi Kleen  
>>
>>  * gcc.target/i386/returninst1.c: New test.
>>  * gcc.target/i386/returninst2.c: New test.
>>  * gcc.target/i386/returninst3.c: New test.
>> ---
>>  gcc/config/i386/i386-opts.h |  6 
>>  gcc/config/i386/i386.c  | 36 +
>>  gcc/config/i386/i386.opt| 21 
>>  gcc/doc/invoke.texi | 14 
>>  gcc/testsuite/gcc.target/i386/returninst1.c | 14 
>>  gcc/testsuite/gcc.target/i386/returninst2.c | 21 
>>  gcc/testsuite/gcc.target/i386/returninst3.c |  9 ++
>>  7 files changed, 121 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/i386/returninst1.c
>>  create mode 100644 gcc/testsuite/gcc.target/i386/returninst2.c
>>  create mode 100644 gcc/testsuite/gcc.target/i386/returninst3.c
>>
>> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
>> index 46366cbfa72..35e9413100e 100644
>> --- a/gcc/config/i386/i386-opts.h
>> +++ b/gcc/config/i386/i386-opts.h
>> @@ -119,4 +119,10 @@ enum indirect_branch {
>>indirect_branch_thunk_extern
>>  };
>>  
>> +enum instrument_return {
>> +  instrument_return_none = 0,
>> +  instrument_return_call,
>> +  instrument_return_nop5
>> +};
>> +
>>  #endif
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index f9ef0b4445b..f7cd94a8139 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -28336,12 +28336,47 @@ ix86_output_indirect_jmp (rtx call_op)
>>  return "%!jmp\t%A0";
>>  }
>>  
>> +/* Output return instrumentation for current function if needed.  */
>> +
>> +static void
>> +output_return_instrumentation (void)
>> +{
>> +  if (ix86_instrument_return != instrument_return_none
>> +  && flag_fentry
>> +  && !DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (cfun->decl))
>> +{
>> +  if (ix86_flag_record_return)
>> +fprintf (asm_out_file, "1:\n");
>> +  switch (ix86_instrument_return)
>> +{
>> +case instrument_return_call:
>> +  fprintf (asm_out_file, "\tcall\t__return__\n");
>> +  break;
>> +case instrument_return_nop5:
>> +  /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1)  */
>> +  fprintf (asm_out_file, ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
>> +  break;
>> +case instrument_return_none:
>> +  break;
>> +}
>> +
>> +  if (ix86_flag_record_return)
>> +{
>> +  fprintf (asm_out_file, "\t.section __return_loc, \"a\",@progbits\n");
>> +  fprintf (asm_out_file, "\t.%s 1b\n", TARGET_64BIT ? "quad" : "long");
>> +  fprintf (asm_out_file, "\t.previous\n");
>> +}
>> +}
>> +}
>> +
>>  /* Output function return.  CALL_OP is the jump target.  Add a REP
>> prefix to RET if LONG_P is true and function return is kept.  */
>>  
>>  const char *
>>  ix86_output_function_return (bool long_p)
>>  {
>> +  output_return_instrumentation ();
>> +
>>if (cfun->machine->function_return_type != indirect_branch_keep)
>>  {
>>char thunk_name[32];
>> @@ -28454,6 +28489,7 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op)
>>  
>>if (SIBLING_CALL_P (insn))
>>  {
>> +  output_return_instrumentation ();
>>if (direct_p)
>>  {
>>if (ix86_nopic_noplt_attribute_p (call_op))
>> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
>> index e7fbf9b6f99..5925b75244f 100644
>> --- a/gcc/config/i386/i386.opt
>> +++ b/gcc/config/i386/i386.opt
>> @@ -1063,3 +1063,24 @@ Support WAITPKG built-in functions and code 
>> generation.
>>  mcldemote
>>  Target Report Mask(ISA_CLDEMOTE) Var(ix86_isa_flags2) Save
>>  Support CLDEMOTE built-in functions and code generation.
>> +
>> 

Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-15 Thread Martin Sebor

On 11/14/2018 02:47 AM, Jakub Jelinek wrote:

On Tue, Nov 13, 2018 at 09:11:41PM -0700, Martin Sebor wrote:

--- gcc/c-family/c-warn.c   (revision 266086)
+++ gcc/c-family/c-warn.c   (working copy)
@@ -2609,3 +2609,39 @@ warn_for_multistatement_macros (location_t body_lo
 inform (guard_loc, "some parts of macro expansion are not guarded by "
"this %qs clause", guard_tinfo_to_string (keyword));
 }
+
+
+/* Diagnose unsuported use of explicit hardware register variable ARG
+   as an argument ARGNO to function FNDECL.  */
+
+void
+warn_hw_reg_arg (tree fndecl, int argno, tree arg)
+{
+  if (!fndecl)
+return;
+
+  /* Avoid diagnosing GCC intrinsics with no library fallbacks.  */
+  if (fndecl_built_in_p (fndecl)
+  && DECL_IS_BUILTIN (fndecl)
+  && !c_decl_implicit (fndecl)
+  && !DECL_ASSEMBLER_NAME_SET_P (fndecl))
+return;
+
+  /* Also avoid diagnosing always_inline functions since those are
+ often used to implement vectorization intrinsics that make use
+ of hardware register variables.  */
+  if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (fndecl)))
+return;
+
+  /* Diagnose uses of local variables declared asm register.  */
+  STRIP_ANY_LOCATION_WRAPPER (arg);
+  if (VAR_P (arg)
+  && !TREE_STATIC (arg)
+  && DECL_HARD_REGISTER (arg)
+  && warning_at (input_location, OPT_Wasm_register_var,
+"unsupported use of hardware register %qD as "
+"argument %d of %qD",
+arg, argno, fndecl))
+inform (DECL_SOURCE_LOCATION (arg),
+   "%qD declared here", arg);
+}


This makes no sense to me.  There is nothing unsupported in passing
a local hard register variable to a function, that is well defined,


PR 88000 was resolved as invalid quoting the following sentence
from the manual as the rationale:

  The only supported use for this feature is to specify registers
  for input and output operands when calling Extended asm.

If these variables are meant to be supported in other contexts
(such as in calls to built-ins or even user-defined functions)
the manual needs to be clarified.


and as your testcase changes show, you broke quite some completely valid
testcases with that.


Let's not be melodramatic.  Nothing was "broken" by a proposed
warning.  But I did say that "if using explicit register vars in
those functions is meant to be supported despite what the manual
says the patch will need tweaking (as will the manual)."


What doesn't work as the reporter expect is assumption that local hard
register variables that live across function calls must have their values
preserved; they can be modified by the callees.


I'm not sure I understand how what you described as supported
is different from what the test case in 88000 does: i.e., pass
the value of a register variable to a function and expect its
value to be unchanged.  If the value happens to be preserved
in some cases/for some registers but not for others, unless
the two sets can be reliably distinguished it seems like a good
candidate for a warning, to help users fall into the same trap.
Users who know what they're doing can easily suppress the warning
and others will fix their code.  Is there a way to do distinguish
the two sets of cases?

Martin


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-15 Thread Jozef Lawrynowicz
On Thu, 15 Nov 2018 09:48:05 -0500
Paul Koning  wrote:

> > On Nov 14, 2018, at 5:19 PM, Jozef Lawrynowicz 
> > wrote:
> > 
> > 
> > I'd be curious if the line I added the xfail to in c-c++-common/pr57371-2.c
> > also fails for pdp11.
> > 
> > The conversion to float might be getting optimized out whenever
> > sizeof(int) < sizeof(float).
> > 
> > Thanks,
> > Jozef  
> 
> Yes, that test on pr57371-2.c also fails on pdp11.
> 
>   paul
> 

Thanks for checking, in that case I'll go ahead an add an effective target for
"int_lt_float".

I'll make a note to investigate that test failure as well. The test
comments:
> We can not get rid of comparison in tests below because of
> potential inexact exception.
If I'm understanding the test correctly, then if the cast to float has been
optimized out, users expecting the inexact float exception to be raised will
have unexpected behaviour.

Jozef


Re: [PATCH] RFC: elide repeated source locations (PR other/84889)

2018-11-15 Thread Martin Sebor

On 11/14/2018 02:12 PM, David Malcolm wrote:

On Mon, 2018-11-12 at 13:37 -0700, Martin Sebor wrote:

On 11/11/2018 07:43 PM, David Malcolm wrote:

We often emit more than one diagnostic at the same source location.
For example, the C++ frontend can emit many diagnostics at
the same source location when suggesting overload candidates.

For example:

../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In
function 'int test_3(s, t)':
../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
error: no match for 'operator&&' (operand types are 's' and 't')
   38 |   return param_s && param_t;
  |  ^~
../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
note: candidate: 'operator&&(bool, bool)' 
../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
note:   no known conversion for argument 2 from 't' to 'bool'

This is overly verbose.  Note how the same location has been
printed
three times, obscuring the pertinent messages.

This patch add a new "elide" value to -fdiagnostics-show-location=
and makes it the default (previously it was "once").  With elision
the above is printed as:

../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C: In
function 'int test_3(s, t)':
../../src/gcc/testsuite/g++.dg/diagnostic/bad-binary-ops.C:38:18:
error: no match for 'operator&&' (operand types are 's' and 't')
   38 |   return param_s && param_t;
  |  ^~
  = note: candidate: 'operator&&(bool, bool)' 
  = note:   no known conversion for argument 2 from 't' to
'bool'

where the followup notes are printed with a '=' lined up with
the source code margin.

Thoughts?


I agree the long pathname in the notes is at first glance redundant
but I'm not sure about using '=' as a shorthand for it.  I have
written many scripts to parse GCC output to extract all diagnostics
(including notes) and publish those on a Web page somewhere, as I'm
sure must have others.  All those scripts would stop working with
this change and require changes to the build system to work again.
Making those changes can be a substantial undertaking in some
organizations.

Have you considered printing just the file name instead?  Or any
other alternatives?


"-fdiagnostics-show-location=once" will restore the old behavior.
Alternatively, if you want to parse GCC output, I'm adding a JSON
output format; see:
  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01038.html
(I'm testing an updated version of that patch)


I understand the change can be disabled by an option.  I also
like making the repetitive pathnames shorter, but the concern
I have is that the change to use the '=' notation by default,
besides being rather unusual and not necessarily intuitive, will
break scripts that search for the pattern:

  "[^:]*:[1-9][0-9]*:[1-9][0-9]*: (error|note|warning): "

and adding a new option to a large build system can be quite
difficult.  I suspect it would also make the notes difficult
or even impossible to associate with the corresponding errors
or warnings in parallel builds (where they may be interleaved
with diagnostics from different compilations).

I think it's possible to make the output shorter/less repetitive
without these side-effects by keeping the current format and
instead abbreviating the pathname, e.g. by printing just the file
name (or ".../filename.c:" with the ellipsis standing in for the
long pathname, though the ellipsis would require adjusting
the scripts that sort diagnostics by the pathname).

Martin

PS As an aside, if we wanted to shorten all pathnames this same
idea could be extended to errors and warnings as well by printing
the pathname in the first one and just the filename in subsequent
ones in the same file.


Re: [PATCH 01/25] Handle vectors that don't fit in an integer.

2018-11-15 Thread Andrew Stubbs

On 14/09/2018 16:37, Richard Sandiford wrote:

diff --git a/gcc/combine.c b/gcc/combine.c
index a2649b6..cbf9dae 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -8621,7 +8621,13 @@ gen_lowpart_or_truncate (machine_mode mode, rtx x)
  {
/* Bit-cast X into an integer mode.  */
if (!SCALAR_INT_MODE_P (GET_MODE (x)))
-   x = gen_lowpart (int_mode_for_mode (GET_MODE (x)).require (), x);
+   {
+ enum machine_mode imode =
+   int_mode_for_mode (GET_MODE (x)).require ();
+ if (imode == BLKmode)
+   return gen_rtx_CLOBBER (mode, const0_rtx);
+ x = gen_lowpart (imode, x);


require () will ICE if there isn't an integer mode and always returns
a scalar_int_mode, so this looks like a no-op.  I think you want
something like:


This is a patch that I inherited from Honza and Martin and didn't know 
what testcase it fixed.


I think that it being broken shows that it's no longer necessary, and 
reverting the patch and retesting confirms this suspicion.


I've removed it from the patch.


@@ -11698,6 +11704,11 @@ gen_lowpart_for_combine (machine_mode omode, rtx x)
if (omode == imode)
  return x;
  
+  /* This can happen when there is no integer mode corresponding

+ to a size of vector mode.  */
+  if (omode == BLKmode)
+goto fail;
+
/* We can only support MODE being wider than a word if X is a
   constant integer or has a mode the same size.  */
if (maybe_gt (GET_MODE_SIZE (omode), UNITS_PER_WORD)


This seems like it's working around a bug in ther caller.


Again, I inherited this hunk. Removing it and retesting shows no 
regressions, so I'm dropping it.



diff --git a/gcc/expr.c b/gcc/expr.c
index cd5cf12..776254a 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -10569,6 +10569,14 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode 
tmode,
  || maybe_gt (bitpos + bitsize,
   GET_MODE_BITSIZE (mode2)));
  
+	/* If the result is in BLKmode and the underlying object is a

+  vector in a register, and the size of the vector is larger than
+  the largest integer mode, then we must force OP0 to be in memory
+  as this is assumed in later code.  */
+   if (REG_P (op0) && VECTOR_MODE_P (mode2) && mode == BLKmode
+   && maybe_gt (bitsize, MAX_FIXED_MODE_SIZE))
+ must_force_mem = 1;
+
/* Handle CONCAT first.  */
if (GET_CODE (op0) == CONCAT && !must_force_mem)
  {


Are you sure this is still needed after:

2018-06-04  Richard Sandiford  

* expr.c (expand_expr_real_1): Force the operand into memory if
its TYPE_MODE is BLKmode and if there is no integer mode for
the number of bits being extracted.


Apparently you're right about this. Hunk dropped.


diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 8d94fca..607a2bd 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -6702,12 +6702,12 @@ vectorizable_store (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
 supported.  */
  unsigned lsize
= group_size * GET_MODE_BITSIZE (elmode);
- elmode = int_mode_for_size (lsize, 0).require ();
  unsigned int lnunits = const_nunits / group_size;
  /* If we can't construct such a vector fall back to
 element extracts from the original vector type and
 element size stores.  */
- if (mode_for_vector (elmode, lnunits).exists ()
+ if (int_mode_for_size (lsize, 0).exists ()
+ && mode_for_vector (elmode, lnunits).exists ()
  && VECTOR_MODE_P (vmode)
  && targetm.vector_mode_supported_p (vmode)
  && (convert_optab_handler (vec_extract_optab,
@@ -7839,11 +7839,11 @@ vectorizable_load (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
 to a larger load.  */
  unsigned lsize
= group_size * TYPE_PRECISION (TREE_TYPE (vectype));
- elmode = int_mode_for_size (lsize, 0).require ();
  unsigned int lnunits = const_nunits / group_size;
  /* If we can't construct such a vector fall back to
 element loads of the original vector type.  */
- if (mode_for_vector (elmode, lnunits).exists ()
+ if (int_mode_for_size (lsize, 0).exists ()
+ && mode_for_vector (elmode, lnunits).exists ()
  && VECTOR_MODE_P (vmode)
  && targetm.vector_mode_supported_p (vmode)
  && (convert_optab_handler (vec_init_optab, vmode, elmode)


These two are OK independently of the rest (if that's convenient).


Thanks, I've committed the attached. These are the only parts of the 
patch that remain. I've confirmed that there are 

Re: [C++ Patch] Fix ensure_literal_type_for_constexpr_object locations

2018-11-15 Thread Jason Merrill
OK.
On Thu, Nov 15, 2018 at 8:39 AM Paolo Carlini  wrote:
>
> Hi,
>
> this one should be straightforward. Tested x86_64-linux.
>
> Thanks, Paolo.
>
> /
>


Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-15 Thread Alexander Monakov
On Thu, 15 Nov 2018, Michael Matz wrote:
> I disagree that there's something to fix.  My mental model for local reg 
> vars has always been that such vars are actually an alias for that 
> register, not comparable to normal automatic variables (so, much like 
> global reg vars, except that they don't reserve away the register from 
> regalloc).  I.e. like volatile they can arbitrarily change their value.  
> I don't know if other peoples mind model is similar, but it certainly is 
> the model that is implemented in the GIMPLE pipeline (and if memory serves 
> well of the RTL pipeline as well).

Reading the documentation certainly does not make that impression to me.
In any case, can you elaborate a bit further please:

1. Regarding the comparison to 'volatile' qualifier.  Suppose you have an
automatic variable 'int v;' in a correct program.  The variable is only used
for some arithmetic, never passed to asms, does not have its address taken.

Thus, changing its
declaration to 'volatile int v;' would not make the program invalid.  Now,
can declaring it as 'register int v asm("%rbx");' (with a callee-saved
register) make the program invalid?  My reading of the documentation is
that it would provide a regalloc hint and have no ill effects.


2. Are testcases given in PR 87984 valid? Quoting the latest example:

int f(void)
{
int o=0, i;
for (i=0; i<3; i++) {
register int a asm("eax");
a = 1;
asm("add %1, %0" : "+r"(o) : "r"(a));
asm("xor %%eax, %%eax" ::: "eax");
}
return o;
}

This follows both your model and the documentation to the letter, and
yet will return 1 rather than 3.

I disagree that it is practical to implement your model on GIMPLE.

Thanks.
Alexander


[PATCH][libbacktrace] Factor out read_string

2018-11-15 Thread Tom de Vries
Hi,

This patch factors out new function read_string in dwarf.c.

Bootstrapped and reg-tested on x86_64.

OK for trunk (or, for stage1)?

Thanks,
- Tom

[libbacktrace] Factor out read_string

2018-11-15  Tom de Vries  

* dwarf.c (read_string): Factor out of ...
(read_attribute, read_line_header, read_line_program): ... here.

---
 libbacktrace/dwarf.c | 39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 4566d37cf2f..c4f8732c7eb 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -411,6 +411,25 @@ advance (struct dwarf_buf *buf, size_t count)
   return 1;
 }
 
+/* Read one zero-terminated string from BUF and advance past the string.  */
+
+static const char *
+read_string (struct dwarf_buf *buf)
+{
+  const char *p = (const char *)buf->buf;
+  size_t len = strnlen (p, buf->left);
+
+  /* - If len == left, we ran out of buffer before finding the zero terminator.
+   Generate an error by advancing len + 1.
+ - If len < left, advance by len + 1 to skip past the zero terminator.  */
+  size_t count = len + 1;
+
+  if (!advance (buf, count))
+return NULL;
+
+  return p;
+}
+
 /* Read one byte from BUF and advance 1 byte.  */
 
 static unsigned char
@@ -694,8 +713,8 @@ read_attribute (enum dwarf_form form, struct dwarf_buf *buf,
   return 1;
 case DW_FORM_string:
   val->encoding = ATTR_VAL_STRING;
-  val->u.string = (const char *) buf->buf;
-  return advance (buf, strnlen ((const char *) buf->buf, buf->left) + 1);
+  val->u.string = read_string (buf);
+  return val->u.string == NULL ? 0 : 1;
 case DW_FORM_block:
   val->encoding = ATTR_VAL_BLOCK;
   return advance (buf, read_uleb128 (buf));
@@ -1649,11 +1668,10 @@ read_line_header (struct backtrace_state *state, struct 
unit *u,
   if (hdr_buf.reported_underflow)
return 0;
 
-  hdr->dirs[i] = (const char *) hdr_buf.buf;
-  ++i;
-  if (!advance (_buf,
-   strnlen ((const char *) hdr_buf.buf, hdr_buf.left) + 1))
+  hdr->dirs[i] = read_string (_buf);
+  if (hdr->dirs[i] == NULL)
return 0;
+  ++i;
 }
   if (!advance (_buf, 1))
 return 0;
@@ -1687,9 +1705,8 @@ read_line_header (struct backtrace_state *state, struct 
unit *u,
   if (hdr_buf.reported_underflow)
return 0;
 
-  filename = (const char *) hdr_buf.buf;
-  if (!advance (_buf,
-   strnlen ((const char *) hdr_buf.buf, hdr_buf.left) + 1))
+  filename = read_string (_buf);
+  if (filename == NULL)
return 0;
   dir_index = read_uleb128 (_buf);
   if (IS_ABSOLUTE_PATH (filename)
@@ -1808,8 +1825,8 @@ read_line_program (struct backtrace_state *state, struct 
dwarf_data *ddata,
const char *f;
unsigned int dir_index;
 
-   f = (const char *) line_buf->buf;
-   if (!advance (line_buf, strnlen (f, line_buf->left) + 1))
+   f = read_string (line_buf);
+   if (f == NULL)
  return 0;
dir_index = read_uleb128 (line_buf);
/* Ignore that time and length.  */


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-15 Thread Marek Polacek
On Thu, Nov 15, 2018 at 10:56:15AM -0500, Marek Polacek wrote:
> +   if (current_class_type)
> +  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
>  
> I think you can remove the if here.

Actually you probably need it after all.

Marek


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-15 Thread Marek Polacek
On Thu, Nov 15, 2018 at 02:26:24PM +0530, Umesh Kalappa wrote:
> Thank you Marek  for the inputs .
> >>In the future, if using diff, please also use the -p option.
> We are using svn diif  and other comments are addressed .

Thanks, but it doesn't seem like the -p option was used.

> please let us know  your take on the  revised attached patch .


Index: cp/ChangeLog
===
--- cp/ChangeLog(revision 266026)
+++ cp/ChangeLog(working copy)
@@ -1,3 +1,9 @@
+2018-11-14  Kamlesh Kumar  
+
+   PR c++/52869
+   *parser.c () :  restore the old current_class_{ptr,ref} by
+   inject_this_parameter().
+

This is still the same; can you adjust it according to my last suggestion?

Index: cp/parser.c
===
--- cp/parser.c (revision 266026)
+++ cp/parser.c (working copy)
@@ -24620,6 +24620,12 @@
{
  matching_parens parens;
  parens.consume_open (parser);
+ 
+ tree save_ccp = current_class_ptr;
+ tree save_ccr = current_class_ref;
+ 

Watch out for trailing whitespace in the blank lines.

+ if (current_class_type)
+  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
 
I think you can remove the if here.

  if (require_constexpr)
{
@@ -24640,6 +24646,9 @@
}
 
  parens.require_close (parser);
+
+ save_ccp = current_class_ptr = save_ccp;
+ save_ccr = current_class_ref = save_ccr;

You don't need to set save_cc[pr] to itself here.

Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 266026)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2018-11-14  Kamlesh Kumar  
+
+   PR c++/52869
+   * g++.dg//DRs/dr52869.C: New.
+

So DR != PR.  Please name the test dr1207.C

Index: testsuite/g++.dg/DRs/dr52869.C
===
--- testsuite/g++.dg/DRs/dr52869.C  (nonexistent)
+++ testsuite/g++.dg/DRs/dr52869.C  (working copy)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -std=c++11" } */

Instead of this, do:

// { dg-do compile { target c++11 } }

Also, I wrote a test that fails if current_class_{ptr,ref} aren't properly
restored:

// DR 1207
// PR c++/52869
// { dg-do compile { target c++11 } }

void
fn ()
{
  struct S {
bool operator!() noexcept(false);
  } s;
  S t = s;
}

So you can add that one too, e.g. testsuite/g++.dg/DRs/dr1207-2.C.

Thanks,
Marek


Re: [PATCH] diagnose unsupported uses of hardware register variables (PR 88000)

2018-11-15 Thread Michael Matz
Hi,

On Wed, 14 Nov 2018, Alexander Monakov wrote:

> On Wed, 14 Nov 2018, Segher Boessenkool wrote:
> > Yeah, using local register vars to access global registers only works
> > by accident.  It does work currently though, and people apparently use
> > it, so we shouldn't break it :-/
> 
> In the proposed approach (copying from/to pseudos just before/after the
> asm) we can emulate historic behavior by making uninitialized pseudos
> take values of the corresponding hardregs. If we decide that doing that
> just for must-uninit pseudos is enough, it's a simple extension to the
> existing init-regs pass.  Does this sound reasonable?

Or we can just decide that nothing needs fixing.  In particular about this 
from Jakub and Segher:

> > What doesn't work as the reporter expect is assumption that local hard
> > register variables that live across function calls must have their 
> > values preserved; they can be modified by the callees.
> 
> It would be really nice if we could fix that :-)

I disagree that there's something to fix.  My mental model for local reg 
vars has always been that such vars are actually an alias for that 
register, not comparable to normal automatic variables (so, much like 
global reg vars, except that they don't reserve away the register from 
regalloc).  I.e. like volatile they can arbitrarily change their value.  
I don't know if other peoples mind model is similar, but it certainly is 
the model that is implemented in the GIMPLE pipeline (and if memory serves 
well of the RTL pipeline as well).

Copying outof/into pseudos around asms serves no purpose with that model, 
it rather makes regvars somewhat "safer" to use, without actually making 
them safer (there _is_ nothing safe about them).


Ciao,
Michael.


[C++ DR 2336] virtual dtors, exception specs & abstract classes

2018-11-15 Thread Nathan Sidwell

This patch implements dr2336, a fix to dr1658

dr1658 made virtual bases of abstract classes ignored for synthesize 
cdtors -- there are no complete objects of such type, so the vbases will 
never be cdtored by such cdtors.


Except when virtual dtors are in play, such a dtor could override a 
virtual dtor in a virtual base, and thus must not have a stricter 
exception specification.  DR2336 addresses that.


I had nearly anticipated this resolution, but I was also looking into 
virtual bases in other cases, and ignoring access in different other 
cases.   This implements the DR2336 wording (it does not distinguish 
between virtual and non-virtual dtors of virtual bases).


booted on x6_64-linux, committing to trunk.

nathan
--
Nathan Sidwell
2018-11-15  Nathan Sidwell  

	DR 2336
	* cp-tree.h (enum special_function_kind): Add sfk_virtual_destructor.
	* method.c (type_has_trivial_fn): Add it.
	(SFK_DTOR_P): Likewise.
	(synthesized_method_base_walk): Don't check access of vbases of
	abstract classes when sfk_virtual_destructor.
	(synthesized_method_walk): Skip vbases of abstract classes except
	when sfk_virtual_destructor.
	(get_defaulted_eh_spec): Set sfk_virtual_destructor as needed.

	* g++.dg/cpp1y/pr79393-3.C: New.

Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h	(revision 266180)
+++ gcc/cp/cp-tree.h	(working copy)
@@ -5099,7 +5099,8 @@ enum special_function_kind {
 			  deletes the object after it has been
 			  destroyed.  */
   sfk_conversion,	   /* A conversion operator.  */
-  sfk_deduction_guide	   /* A class template deduction guide.  */
+  sfk_deduction_guide,	   /* A class template deduction guide.  */
+  sfk_virtual_destructor   /* Used by member synthesis fns.  */
 };
 
 /* The various kinds of linkage.  From [basic.link],
Index: gcc/cp/method.c
===
--- gcc/cp/method.c	(revision 266180)
+++ gcc/cp/method.c	(working copy)
@@ -402,6 +402,7 @@ type_has_trivial_fn (tree ctype, special
 case sfk_move_assignment:
   return !TYPE_HAS_COMPLEX_MOVE_ASSIGN (ctype);
 case sfk_destructor:
+case sfk_virtual_destructor:
   return !TYPE_HAS_NONTRIVIAL_DESTRUCTOR (ctype);
 case sfk_inheriting_constructor:
   return false;
@@ -1287,7 +1288,7 @@ process_subob_fn (tree fn, tree *spec_p,
 #define SFK_CTOR_P(sfk) \
   ((sfk) >= sfk_constructor && (sfk) <= sfk_move_constructor)
 #define SFK_DTOR_P(sfk) \
-  ((sfk) == sfk_destructor)
+  ((sfk) == sfk_destructor || (sfk) == sfk_virtual_destructor)
 #define SFK_ASSIGN_P(sfk) \
   ((sfk) == sfk_copy_assignment || (sfk) == sfk_move_assignment)
 #define SFK_COPY_P(sfk) \
@@ -1481,12 +1482,11 @@ synthesized_method_base_walk (tree binfo
   if (flag_new_inheriting_ctors)
 	defer = dk_deferred;
 }
-  /* To be conservative, ignore access to the base dtor that
- DR1658 instructs us to ignore.  See the comment in
- synthesized_method_walk.  */
-  else if (cxx_dialect >= cxx14 && fnname == complete_dtor_identifier
+  else if (cxx_dialect >= cxx14 && sfk == sfk_virtual_destructor
 	   && BINFO_VIRTUAL_P (base_binfo)
 	   && ABSTRACT_CLASS_TYPE_P (BINFO_TYPE (binfo)))
+/* Don't check access when looking at vbases of abstract class's
+   virtual destructor.  */
 defer = dk_no_check;
 
   if (defer != dk_no_deferred)
@@ -1572,7 +1572,7 @@ synthesized_method_walk (tree ctype, spe
   bool check_vdtor = false;
   tree fnname;
 
-if (SFK_DTOR_P (sfk))
+  if (SFK_DTOR_P (sfk))
 {
   check_vdtor = true;
   /* The synthesized method will call base dtors, but check complete
@@ -1696,12 +1696,11 @@ if (SFK_DTOR_P (sfk))
   else if (vec_safe_is_empty (vbases))
 /* No virtual bases to worry about.  */;
   else if (ABSTRACT_CLASS_TYPE_P (ctype) && cxx_dialect >= cxx14
-	   /* DR 1658 specifies that vbases of abstract classes are
-	  ignored for both ctors and dtors.  However, that breaks
-	  virtual dtor overriding when the ignored base has a
-	  throwing destructor.  So, ignore that piece of 1658.  A
-	  defect has been filed (no number yet).  */
-	   && sfk != sfk_destructor)
+	   /* DR 1658 specifis that vbases of abstract classes are
+	  ignored for both ctors and dtors.  Except DR 2338
+	  overrides that skipping when determing the eh-spec of a
+	  virtual destructor.  */
+	   && sfk != sfk_virtual_destructor)
 /* Vbase cdtors are not relevant.  */;
   else
 {
@@ -1748,6 +1747,9 @@ get_defaulted_eh_spec (tree decl, tsubst
   tree spec = empty_except_spec;
   bool diag = !DECL_DELETED_FN (decl) && (complain & tf_error);
   tree inh = DECL_INHERITED_CTOR (decl);
+  if (SFK_DTOR_P (sfk) && DECL_VIRTUAL_P (decl))
+/* We have to examine virtual bases even if abstract.  */
+sfk = sfk_virtual_destructor;
   synthesized_method_walk (ctype, sfk, const_p, , NULL, NULL,
 			   NULL, diag, , parms);
   return spec;
Index: 

Re: [PATCH, csky] Update dynamic linker'name

2018-11-15 Thread 瞿仙淼


Hi,
GNU_USER_DYNAMIC_LINKER is defined in linux.h, it will expand to 
GLIBC_DYNAMIC_LINKER when configured with glibc

> 在 2018年11月15日,下午6:30,Richard Biener  写道:
> 
> On Thu, Nov 15, 2018 at 7:02 AM 瞿仙淼  wrote:
>> 
>> Hi,
>>I have submitted a patch to update dynamic linker'name
>> 
>> 
>> Index: gcc/ChangeLog
>> ===
>> --- gcc/ChangeLog   (revision 266171)
>> +++ gcc/ChangeLog   (working copy)
>> @@ -1,3 +1,9 @@
>> +2018-11-15  Xianmiao Qu  
>> +
>> +   * config/csky/csky-linux-elf.h (LINUX_DYNAMIC_LINKER): Remove.
>> +   (GLIBC_DYNAMIC_LINKER): Define.
>> +   (LINUX_TARGET_LINK_SPEC): Update the dynamic linker's name.
>> +
>> 2018-11-15  Bin Cheng  
>> 
>>PR tree-optimization/84648
>> Index: gcc/config/csky/csky-linux-elf.h
>> ===
>> --- gcc/config/csky/csky-linux-elf.h(revision 266171)
>> +++ gcc/config/csky/csky-linux-elf.h(working copy)
>> @@ -61,7 +61,7 @@
>>   %{mvdsp:-mvdsp}  \
>>   "
>> 
>> -#define LINUX_DYNAMIC_LINKER  "/lib/ld.so.1"
>> +#define GLIBC_DYNAMIC_LINKER 
>> "/lib/ld-linux-cskyv2%{mhard-float:-hf}%{mbig-endian:-be}.so.1"
>> 
>> #define LINUX_TARGET_LINK_SPEC "%{h*} %{version:-v}\
>>%{b}\
>> @@ -70,7 +70,7 @@
>>%{symbolic:-Bsymbolic}  \
>>%{!static:  \
>>  %{rdynamic:-export-dynamic}   \
>> - %{!shared:-dynamic-linker " LINUX_DYNAMIC_LINKER "}}  \
>> + %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}   \
> 
> ^^^ GNU_USER_DYNAMIC_LINKER vs. GLIBC_DYNAMIC_LINKER
> 
>>-X  \
>>%{mbig-endian:-EB} %{mlittle-endian:-EL}\
>>%{EB:-EB} %{EL:-EL}"
>> Index: libgcc/ChangeLog
>> ===
>> --- libgcc/ChangeLog(revision 266171)
>> +++ libgcc/ChangeLog(working copy)
>> @@ -1,3 +1,7 @@
>> +2018-11-15  Xianmiao Qu  
>> +
>> +   * config/csky/linux-unwind.h: Fix coding style.
>> +
>> 2018-11-13  Xianmiao Qu  
>> 
>>* config/csky/linux-unwind.h (_sig_ucontext_t): Remove.
>> Index: libgcc/config/csky/linux-unwind.h
>> ===
>> --- libgcc/config/csky/linux-unwind.h   (revision 266171)
>> +++ libgcc/config/csky/linux-unwind.h   (working copy)
>> @@ -25,10 +25,8 @@
>> 
>> #ifndef inhibit_libc
>> 
>> -/*
>> - * Do code reading to identify a signal frame, and set the frame state data
>> - * appropriately.  See unwind-dw2.c for the structs.
>> - */
>> +/* Do code reading to identify a signal frame, and set the frame state data
>> +   appropriately.  See unwind-dw2.c for the structs.  */
>> 
>> #include 
>> #include 



[committed] doc/ux.texi: auto_diagnostic_group now does something

2018-11-15 Thread David Malcolm
As of r266186, auto_diagnostic_group is now user-visible
(via -fdiagnostics-format=json), so update the diagnostic
guidelines accordingly.

Committed to trunk as r266187.

gcc/ChangeLog:
* doc/ux.texi (Group logically-related diagnostics): Move
discussion of auto_diagnostic_group into this new subsection.
Give an example of where this grouping is used.
---
 gcc/doc/ux.texi | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/ux.texi b/gcc/doc/ux.texi
index 3b395f2..47b4492 100644
--- a/gcc/doc/ux.texi
+++ b/gcc/doc/ux.texi
@@ -376,12 +376,15 @@ where the @code{double} and @code{int} are colorized to 
highlight them.
 
 @c %H and %I were added in r248698.
 
+@subsection Group logically-related diagnostics
+
 Use @code{auto_diagnostic_group} when issuing multiple related
 diagnostics (seen in various examples on this page).  This informs the
 diagnostic subsystem that all diagnostics issued within the lifetime
-of the @code{auto_diagnostic_group} are related.  (Currently it doesn't
-do anything with this information, but we may implement that in the
-future).
+of the @code{auto_diagnostic_group} are related.  For example,
+@option{-fdiagnostics-format=json} will treat the first diagnostic
+emitted within the group as a top-level diagnostic, and all subsequent
+diagnostics within the group as its children.
 
 @subsection Quoting
 Text should be quoted by either using the @samp{q} modifier in a directive
-- 
1.8.5.3



Re: Don't use %z printf format length specified

2018-11-15 Thread Richard Biener
On Thu, Nov 15, 2018 at 2:31 PM Michael Matz  wrote:
>
> Hi,
>
> On Thu, 15 Nov 2018, Richard Biener wrote:
>
> > > Okay, probably.  Then consider the same patch with sprinkling casts to
> > > long for all these arguments (the actual numbers will fit that type in
> > > reality).  I could of course also use PRIu64 but that makes my eyes
> > > bleed.
> >
> > Hmm.  Can you use PRIu64 please and cast to uint64_t?  OK with that
> > change.
>
> Instead I'd like to check in this.  It adds a new macro expanding to
>   "%" #n PRIu64 "%c"
> (with n being the width) corresponding to one SIZE_AMOUNT macro "argument"
> (which are actually two arguments), and hides the cast in the latter macro
> itself.  At least the number of format string args and actual args
> corresponds again then and my eyes bleed a little less.  Still okay?

Works for me.

Richard.

>
> Ciao,
> Michael.
>
> * system.h (PRsa): New macro.
> (SIZE_AMOUNT): Cast number to uint64_t.
> * alloc-pool.h (pool_usage::dump): Don't use %zu but PRsa.
> (pool_usage::dump_footer): Likewise and also use PRIu64.
> * bitmap.h (bitmap_usage::dump): Likewise.
> * ggc-common.c (ggc_usage::dump): Likewise.
> * ggc-page.c (ggc_print_statistics): Likewise.
> * mem-stats.h (mem_usage::dump): Likewise.
> (mem_usage::dump_footer): Likewise.
> * rtl.c (dump_rtx_statistics): Likewise.
> * vec.c (vec_usage::dump): Likewise.
> (vec_usage::dump_footer): Likewise.
>
> diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h
> index d17a05ca4fb1..81cb69e227ab 100644
> --- a/gcc/alloc-pool.h
> +++ b/gcc/alloc-pool.h
> @@ -63,8 +63,8 @@ struct pool_usage: public mem_usage
>{
>  char *location_string = loc->to_string ();
>
> -fprintf (stderr, "%-32s%-48s %5zu%c%9zu%c:%5.1f%%%9zu"
> -"%c%9zu%c:%5.1f%%%12zu\n",
> +fprintf (stderr, "%-32s%-48s " PRsa(5) PRsa(9) ":%5.1f%%"
> +PRsa(9) PRsa(9) ":%5.1f%%%12" PRIu64 "\n",
>  m_pool_name, location_string,
>  SIZE_AMOUNT (m_instances),
>  SIZE_AMOUNT (m_allocated),
> @@ -72,7 +72,7 @@ struct pool_usage: public mem_usage
>  SIZE_AMOUNT (m_peak),
>  SIZE_AMOUNT (m_times),
>  get_percent (m_times, total.m_times),
> -m_element_size);
> +(uint64_t)m_element_size);
>
>  free (location_string);
>}
> @@ -91,7 +91,7 @@ struct pool_usage: public mem_usage
>dump_footer ()
>{
>  print_dash_line ();
> -fprintf (stderr, "%s%82zu%c%10zu%c\n", "Total",
> +fprintf (stderr, "%s" PRsa(82) PRsa(10) "\n", "Total",
>  SIZE_AMOUNT (m_instances), SIZE_AMOUNT (m_allocated));
>  print_dash_line ();
>}
> diff --git a/gcc/bitmap.h b/gcc/bitmap.h
> index 973ea846baf1..9a180daa7454 100644
> --- a/gcc/bitmap.h
> +++ b/gcc/bitmap.h
> @@ -239,9 +239,9 @@ struct bitmap_usage: public mem_usage
>{
>  char *location_string = loc->to_string ();
>
> -fprintf (stderr, "%-48s %9zu%c:%5.1f%%"
> -"%9zu%c%9zu%c:%5.1f%%"
> -"%11" PRIu64 "%c%11" PRIu64 "%c%10s\n",
> +fprintf (stderr, "%-48s " PRsa (9) ":%5.1f%%"
> +PRsa (9) PRsa (9) ":%5.1f%%"
> +PRsa (11) PRsa (11) "%10s\n",
>  location_string, SIZE_AMOUNT (m_allocated),
>  get_percent (m_allocated, total.m_allocated),
>  SIZE_AMOUNT (m_peak), SIZE_AMOUNT (m_times),
> diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c
> index 9fdba23ce4c2..c989fb01e669 100644
> --- a/gcc/ggc-common.c
> +++ b/gcc/ggc-common.c
> @@ -884,8 +884,8 @@ struct ggc_usage: public mem_usage
>{
>  size_t balance = get_balance ();
>  fprintf (stderr,
> -"%-48s %9zu%c:%5.1f%%%9zu%c:%5.1f%%"
> -"%9zu%c:%5.1f%%%9zu%c:%5.1f%%%9zu%c\n",
> +"%-48s " PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%"
> +PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%" PRsa (9) "\n",
>  prefix, SIZE_AMOUNT (m_collected),
>  get_percent (m_collected, total.m_collected),
>  SIZE_AMOUNT (m_freed), get_percent (m_freed, total.m_freed),
> diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
> index 00c2864711f0..f04b22ca8cca 100644
> --- a/gcc/ggc-page.c
> +++ b/gcc/ggc-page.c
> @@ -2288,14 +2288,15 @@ ggc_print_statistics (void)
>   overhead += (sizeof (page_entry) - sizeof (long)
>+ BITMAP_SIZE (OBJECTS_IN_PAGE (p) + 1));
> }
> -  fprintf (stderr, "%-8zu %10zu%c %10zu%c %10zu%c\n",
> -  OBJECT_SIZE (i),
> +  fprintf (stderr, "%-8" PRIu64 " " PRsa (10) " " PRsa (10) " "
> +  PRsa (10) "\n",
> +  (uint64_t)OBJECT_SIZE (i),
>SIZE_AMOUNT (allocated),
>SIZE_AMOUNT (in_use),
>SIZE_AMOUNT (overhead));
>total_overhead += overhead;
>  }
> -  fprintf (stderr, "%-8s %10zu%c %10zu%c %10zu%c\n",
> +  fprintf (stderr, "%-8s " PRsa 

Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-15 Thread Paul Koning



> On Nov 14, 2018, at 5:19 PM, Jozef Lawrynowicz  
> wrote:
> 
> On Wed, 14 Nov 2018 11:30:39 -0500
> Paul Koning  wrote:
> 
>>> On Nov 14, 2018, at 10:44 AM, Jozef Lawrynowicz 
>>> wrote:
>>> 
>>> Patch 1 tweaks dg directives in tests specifically for msp430. Many of
>>> these are extensions to existing target selectors in dg directives.
>>> 
>>> <0001-TESTSUITE-MSP430-Tweak-dg-directives-for-msp430-elf.patch>  
>> 
>> For pr41779.c, you have
>> 
>> +/* { dg-skip-if "int is smaller than float" { msp430-*-* } } */
>> 
>> I take it that means: sizeof(int) < sizeof(float).  That property also holds
>> for pdp11 and perhaps other targets.  Would it make sense to introduce a new
>> effective-target flag for that check instead?
>> 
>>  paul
>> 
> 
> Paul,
> 
> Yes you are correct the comment implies sizeof(int) < sizeof(float).
> 
> I believe this was the only test where this property affected the test
> results, so a new effective-target flag is probably only worth adding if it
> affects at least a couple of tests.
> On the other hand, I suppose there is no harm in adding another
> check-effective-target and it at least means we'll catch failures across more
> targets.
> 
> I'd be curious if the line I added the xfail to in c-c++-common/pr57371-2.c
> also fails for pdp11.
> 
> The conversion to float might be getting optimized out whenever
> sizeof(int) < sizeof(float).
> 
> Thanks,
> Jozef

Yes, that test on pr57371-2.c also fails on pdp11.

paul



Re: [C++ Patch] Fix ensure_literal_type_for_constexpr_object locations

2018-11-15 Thread Paolo Carlini

Hi,

On 15/11/18 15:34, Marek Polacek wrote:
How about using location_of instead of DECL_SOURCE_LOCATION? 


Well, we know that VAR_P is true of that DECL tree node, thus I don't 
see how location_of would be better. Certainly is more complex.


Paolo.



[committed] v2: Machine-readable diagnostic output (PR other/19165)

2018-11-15 Thread David Malcolm
Changed in v2:
* added test coverage
* improved docs
* pass orig_diag_kind to the finalizer callback

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Committed to trunk as r266186.

Blurb from v1:

This patch implements a -fdiagnostics-format=json option which
converts the diagnostics to be output to stderr in a JSON format;
see the documentation in invoke.texi.

Logically-related diagnostics are nested at the JSON level, using
the auto_diagnostic_group mechanism.

gcc/ChangeLog:
PR other/19165
* Makefile.in (OBJS): Move json.o to...
(OBJS-libcommon): ...here and add diagnostic-format-json.o.
* common.opt (fdiagnostics-format=): New option.
(diagnostics_output_format): New enum.
* diagnostic-format-json.cc: New file.
* diagnostic.c (default_diagnostic_final_cb): New function, taken
from start of diagnostic_finish.
(diagnostic_initialize): Initialize final_cb to
default_diagnostic_final_cb.
(diagnostic_finish): Move "being treated as errors" messages to
default_diagnostic_final_cb.  Call any final_cb.
(default_diagnostic_finalizer): Add diagnostic_t param.
(diagnostic_report_diagnostic): Pass "orig_diag_kind" to
diagnostic_finalizer callback.
* diagnostic.h (enum diagnostics_output_format): New enum.
(diagnostic_finalizer_fn): Reimplement, adding diagnostic_t param.
(struct diagnostic_context): Add "final_cb".
(default_diagnostic_finalizer): Add diagnostic_t param.
(diagnostic_output_format_init): New decl.
* doc/invoke.texi (-fdiagnostics-format): New option.
* dwarf2out.c (gen_producer_string): Ignore
OPT_fdiagnostics_format_.
* gcc.c (driver_handle_option): Handle OPT_fdiagnostics_format_.
* lto-wrapper.c (append_diag_options): Ignore it.
* opts.c (common_handle_option): Handle it.

gcc/c-family/ChangeLog:
PR other/19165
* c-opts.c (c_diagnostic_finalizer): Add diagnostic_t param.

gcc/fortran/ChangeLog:
PR other/19165
* error.c (gfc_diagnostic_finalizer): Add diagnostic_t param.

gcc/jit/ChangeLog:
PR other/19165
* dummy-frontend.c (jit_begin_diagnostic): Add diagnostic_t param.

gcc/testsuite/ChangeLog:
PR other/19165
* c-c++-common/diagnostic-format-json-1.c: New test.
* c-c++-common/diagnostic-format-json-2.c: New test.
* c-c++-common/diagnostic-format-json-3.c: New test.
* c-c++-common/diagnostic-format-json-4.c: New test.
* c-c++-common/diagnostic-format-json-5.c: New test.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(custom_diagnostic_finalizer): Add diagnostic_t param.
* gcc.dg/plugin/location_overflow_plugin.c
(verify_unpacked_ranges): Likewise.
(verify_no_columns): Likewise.
* gfortran.dg/diagnostic-format-json-1.F90: New test.
* gfortran.dg/diagnostic-format-json-2.F90: New test.
* gfortran.dg/diagnostic-format-json-3.F90: New test.
---
 gcc/Makefile.in|   2 +-
 gcc/c-family/c-opts.c  |   3 +-
 gcc/common.opt |  17 ++
 gcc/diagnostic-format-json.cc  | 264 +
 gcc/diagnostic.c   |  45 ++--
 gcc/diagnostic.h   |  23 +-
 gcc/doc/invoke.texi| 188 +++
 gcc/dwarf2out.c|   1 +
 gcc/fortran/error.c|   3 +-
 gcc/gcc.c  |   5 +
 gcc/jit/dummy-frontend.c   |   3 +-
 gcc/lto-wrapper.c  |   1 +
 gcc/opts.c |   5 +
 .../c-c++-common/diagnostic-format-json-1.c|  25 ++
 .../c-c++-common/diagnostic-format-json-2.c|  26 ++
 .../c-c++-common/diagnostic-format-json-3.c|  26 ++
 .../c-c++-common/diagnostic-format-json-4.c|  55 +
 .../c-c++-common/diagnostic-format-json-5.c|  46 
 .../plugin/diagnostic_plugin_test_show_locus.c |   3 +-
 .../gcc.dg/plugin/location_overflow_plugin.c   |  10 +-
 .../gfortran.dg/diagnostic-format-json-1.F90   |  25 ++
 .../gfortran.dg/diagnostic-format-json-2.F90   |  26 ++
 .../gfortran.dg/diagnostic-format-json-3.F90   |  26 ++
 23 files changed, 800 insertions(+), 28 deletions(-)
 create mode 100644 gcc/diagnostic-format-json.cc
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-1.c
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-2.c
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-3.c
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-4.c
 create mode 100644 

Re: [C++ Patch] Fix ensure_literal_type_for_constexpr_object locations

2018-11-15 Thread Marek Polacek
On Thu, Nov 15, 2018 at 02:39:14PM +0100, Paolo Carlini wrote:
> Hi,
> 
> this one should be straightforward. Tested x86_64-linux.
> 
> Thanks, Paolo.
> 
> /
> 

> /cp
> 2018-11-15  Paolo Carlini  
> 
>   * constexpr.c (ensure_literal_type_for_constexpr_object): Use
>   DECL_SOURCE_LOCATION in error_at calls.
> 
> /testsuite
> 2018-11-15  Paolo Carlini  
> 
>   * g++.dg/cpp0x/constexpr-diag3.C: Check locations too.
>   * g++.dg/cpp0x/constexpr-ice19.C: Likewise.
>   * g++.dg/cpp0x/constexpr-nonlit2.C: Likewise.
>   * g++.dg/cpp1z/constexpr-lambda15.C: Likewise.
>   * g++.dg/ext/constexpr-vla5.C: Likewise.
>   * g++.dg/gomp/pr85134.C: Likewise.

> Index: cp/constexpr.c
> ===
> --- cp/constexpr.c(revision 266174)
> +++ cp/constexpr.c(working copy)
> @@ -98,8 +98,9 @@ ensure_literal_type_for_constexpr_object (tree dec
> if (DECL_DECLARED_CONSTEXPR_P (decl))
>   {
> auto_diagnostic_group d;
> -   error ("the type %qT of % variable %qD "
> -  "is not literal", type, decl);
> +   error_at (DECL_SOURCE_LOCATION (decl),
> + "the type %qT of % variable %qD "
> + "is not literal", type, decl);
> explain_non_literal_class (type);
> decl = error_mark_node;
>   }
> @@ -108,8 +109,9 @@ ensure_literal_type_for_constexpr_object (tree dec
> if (!is_instantiation_of_constexpr (current_function_decl))
>   {
> auto_diagnostic_group d;
> -   error ("variable %qD of non-literal type %qT in % 
> "
> -  "function", decl, type);
> +   error_at (DECL_SOURCE_LOCATION (decl),
> + "variable %qD of non-literal type %qT in "
> + "% function", decl, type);
> explain_non_literal_class (type);
> decl = error_mark_node;
>   }
> @@ -119,8 +121,9 @@ ensure_literal_type_for_constexpr_object (tree dec
>else if (DECL_DECLARED_CONSTEXPR_P (decl)
>  && variably_modified_type_p (type, NULL_TREE))
>   {
> -   error ("% variable %qD has variably-modified type %qT",
> -  decl, type);
> +   error_at (DECL_SOURCE_LOCATION (decl),
> + "% variable %qD has variably-modified "
> + "type %qT", decl, type);
> decl = error_mark_node;

How about using location_of instead of DECL_SOURCE_LOCATION?

Marek


[committed] graphite: add missing dump_enabled_p checks (PR tree-optimization/88015)

2018-11-15 Thread David Malcolm
As of r266080, calls to dump_print* that aren't guarded by
  if (dump_enabled_p ())
lead to an assertion failure.

This patch fixes a few calls that weren't guarded, avoiding a
call to find_loop_location for the no-dumping case.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Committed to trunk as r266184, under the "obvious" rule.

gcc/ChangeLog:
PR tree-optimization/88015
* graphite-isl-ast-to-gimple.c
(translate_isl_ast_to_gimple::scop_to_isl_ast): Add missing check
for dump_enabled_p.
* graphite-sese-to-poly.c (build_poly_scop): Likewise.
---
 gcc/graphite-isl-ast-to-gimple.c | 25 ++---
 gcc/graphite-sese-to-poly.c  |  3 ++-
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index e2dbf6f..cc93fb9 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1411,17 +1411,20 @@ scop_to_isl_ast (scop_p scop)
   isl_ctx_set_max_operations (scop->isl_context, old_max_operations);
   if (isl_ctx_last_error (scop->isl_context) != isl_error_none)
 {
-  dump_user_location_t loc = find_loop_location
-   (scop->scop_info->region.entry->dest->loop_father);
-  if (isl_ctx_last_error (scop->isl_context) == isl_error_quota)
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
-"loop nest not optimized, AST generation timed out "
-"after %d operations [--param max-isl-operations]\n",
-max_operations);
-  else
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
-"loop nest not optimized, ISL AST generation "
-"signalled an error\n");
+  if (dump_enabled_p ())
+   {
+ dump_user_location_t loc = find_loop_location
+   (scop->scop_info->region.entry->dest->loop_father);
+ if (isl_ctx_last_error (scop->isl_context) == isl_error_quota)
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
+"loop nest not optimized, AST generation timed out 
"
+"after %d operations [--param 
max-isl-operations]\n",
+max_operations);
+ else
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
+"loop nest not optimized, ISL AST generation "
+"signalled an error\n");
+   }
   isl_ast_node_free (ast_isl);
   return NULL;
 }
diff --git a/gcc/graphite-sese-to-poly.c b/gcc/graphite-sese-to-poly.c
index 69898d4..1d41cff 100644
--- a/gcc/graphite-sese-to-poly.c
+++ b/gcc/graphite-sese-to-poly.c
@@ -1218,7 +1218,8 @@ build_poly_scop (scop_p scop)
   enum isl_error err = isl_ctx_last_error (scop->isl_context);
   isl_ctx_reset_error (scop->isl_context);
   isl_options_set_on_error (scop->isl_context, old_err);
-  if (err != isl_error_none)
+  if (err != isl_error_none
+  && dump_enabled_p ())
 dump_printf (MSG_MISSED_OPTIMIZATION,
 "ISL error while building poly scop\n");
 
-- 
1.8.5.3



[PATCH] Fix PR88031

2018-11-15 Thread Richard Biener


With one of my last changes we regressed here so this goes all the
way cleaning up things so we only have a single flag to
vectorizable_condition whetehr we are called from reduction context.
In theory the !multiple-types restriction could be easily lifted now
(just remove the check).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-15  Richard Biener  

PR tree-optimization/88031
* tree-vect-loop.c (vectorizable_reduction): Move check
for multiple types earlier so we get the expected dump.
Simplify calls to vectorizable_condition.
* tree-vect-stmts.h (vectorizable_condition): Update prototype.
* tree-vect-stmts.c (vectorizable_condition): Instead of
reduc_def and reduc_index take just a flag.  Simplify
code-generation now that we can rely on the defs being set up.
(vectorizable_comparison): Remove unused argument.

* gcc.dg/pr88031.c: New testcase.


diff --git a/gcc/testsuite/gcc.dg/pr88031.c b/gcc/testsuite/gcc.dg/pr88031.c
new file mode 100644
index 000..2c1d1c1391d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr88031.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a[512];
+int b;
+void d()
+{
+  unsigned char c;
+  for (; b; b++) {
+  c = 1;
+  for (; c; c <<= 1) {
+ a[b] <<= 8;
+ if (b & c)
+   a[b] = 1;
+  }
+  }
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index eb01acdf717..88b980bb9d7 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6531,14 +6531,24 @@ vectorizable_reduction (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
 double_reduc = true;
 }
 
+  vect_reduction_type reduction_type
+= STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info);
+  if ((double_reduc || reduction_type != TREE_CODE_REDUCTION)
+  && ncopies > 1)
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"multiple types in double reduction or condition "
+"reduction.\n");
+  return false;
+}
+
   if (code == COND_EXPR)
 {
   /* Only call during the analysis stage, otherwise we'll lose
-STMT_VINFO_TYPE.  We'll pass ops[0] as reduc_op, it's only
-used as a flag during analysis.  */
+STMT_VINFO_TYPE.  */
   if (!vec_stmt && !vectorizable_condition (stmt_info, gsi, NULL,
-   ops[0], 0, NULL,
-   cost_vec))
+   true, NULL, cost_vec))
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -6638,8 +6648,6 @@ vectorizable_reduction (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
   (and also the same tree-code) when generating the epilog code and
   when generating the code inside the loop.  */
 
-  vect_reduction_type reduction_type
-= STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info);
   if (orig_stmt_info
   && (reduction_type == TREE_CODE_REDUCTION
  || reduction_type == FOLD_LEFT_REDUCTION))
@@ -6729,16 +6737,6 @@ vectorizable_reduction (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
   return false;
 }
 
-  if ((double_reduc || reduction_type != TREE_CODE_REDUCTION)
-  && ncopies > 1)
-{
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"multiple types in double reduction or condition "
-"reduction.\n");
-  return false;
-}
-
   /* For SLP reductions, see if there is a neutral value we can use.  */
   tree neutral_op = NULL_TREE;
   if (slp_node)
@@ -7003,7 +7001,7 @@ vectorizable_reduction (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
 {
   gcc_assert (!slp_node);
   return vectorizable_condition (stmt_info, gsi, vec_stmt,
-NULL, reduc_index, NULL, NULL);
+true, NULL, NULL);
 }
 
   /* Create the destination vector  */
@@ -7035,9 +7033,7 @@ vectorizable_reduction (stmt_vec_info stmt_info, 
gimple_stmt_iterator *gsi,
 {
   gcc_assert (!slp_node);
  vectorizable_condition (stmt_info, gsi, vec_stmt,
- PHI_RESULT (phis[0]->stmt),
- reduc_index, NULL, NULL);
-  /* Multiple types are not supported for condition.  */
+ true, NULL, NULL);
   break;
 }
   if (code == LSHIFT_EXPR
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 74646570e2a..a67a7f4e348 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -8675,17 +8675,14 @@ vect_is_simple_cond (tree cond, vec_info *vinfo,
stmt using VEC_COND_EXPR  to replace it, put it in VEC_STMT, and insert it
at 

[C++ Patch] Fix ensure_literal_type_for_constexpr_object locations

2018-11-15 Thread Paolo Carlini

Hi,

this one should be straightforward. Tested x86_64-linux.

Thanks, Paolo.

/

/cp
2018-11-15  Paolo Carlini  

* constexpr.c (ensure_literal_type_for_constexpr_object): Use
DECL_SOURCE_LOCATION in error_at calls.

/testsuite
2018-11-15  Paolo Carlini  

* g++.dg/cpp0x/constexpr-diag3.C: Check locations too.
* g++.dg/cpp0x/constexpr-ice19.C: Likewise.
* g++.dg/cpp0x/constexpr-nonlit2.C: Likewise.
* g++.dg/cpp1z/constexpr-lambda15.C: Likewise.
* g++.dg/ext/constexpr-vla5.C: Likewise.
* g++.dg/gomp/pr85134.C: Likewise.
Index: cp/constexpr.c
===
--- cp/constexpr.c  (revision 266174)
+++ cp/constexpr.c  (working copy)
@@ -98,8 +98,9 @@ ensure_literal_type_for_constexpr_object (tree dec
  if (DECL_DECLARED_CONSTEXPR_P (decl))
{
  auto_diagnostic_group d;
- error ("the type %qT of % variable %qD "
-"is not literal", type, decl);
+ error_at (DECL_SOURCE_LOCATION (decl),
+   "the type %qT of % variable %qD "
+   "is not literal", type, decl);
  explain_non_literal_class (type);
  decl = error_mark_node;
}
@@ -108,8 +109,9 @@ ensure_literal_type_for_constexpr_object (tree dec
  if (!is_instantiation_of_constexpr (current_function_decl))
{
  auto_diagnostic_group d;
- error ("variable %qD of non-literal type %qT in % 
"
-"function", decl, type);
+ error_at (DECL_SOURCE_LOCATION (decl),
+   "variable %qD of non-literal type %qT in "
+   "% function", decl, type);
  explain_non_literal_class (type);
  decl = error_mark_node;
}
@@ -119,8 +121,9 @@ ensure_literal_type_for_constexpr_object (tree dec
   else if (DECL_DECLARED_CONSTEXPR_P (decl)
   && variably_modified_type_p (type, NULL_TREE))
{
- error ("% variable %qD has variably-modified type %qT",
-decl, type);
+ error_at (DECL_SOURCE_LOCATION (decl),
+   "% variable %qD has variably-modified "
+   "type %qT", decl, type);
  decl = error_mark_node;
}
 }
Index: testsuite/g++.dg/cpp0x/constexpr-diag3.C
===
--- testsuite/g++.dg/cpp0x/constexpr-diag3.C(revision 266174)
+++ testsuite/g++.dg/cpp0x/constexpr-diag3.C(working copy)
@@ -24,7 +24,7 @@ struct complex// { dg-message "no 
.constexpr.
   double im;
 };
 
-constexpr complex co1(0, 1);  // { dg-error "not literal" }
+constexpr complex co1(0, 1);  // { dg-error "19:the type .const complex. 
of .constexpr. variable .co1. is not literal" }
 constexpr double dd2 = co1.real(); // { dg-error "|in .constexpr. expansion of 
" }
 
 // 
Index: testsuite/g++.dg/cpp0x/constexpr-ice19.C
===
--- testsuite/g++.dg/cpp0x/constexpr-ice19.C(revision 266174)
+++ testsuite/g++.dg/cpp0x/constexpr-ice19.C(working copy)
@@ -9,5 +9,6 @@ struct A
 
 struct B
 {
-  static constexpr A a {};  // { dg-error "not literal|in-class 
initialization" }
+  static constexpr A a {};  // { dg-error "22:the type .const A. of 
.constexpr. variable .B::a. is not literal" }
+// { dg-error "in-class initialization" "" { target c++11 } .-1 }
 };
Index: testsuite/g++.dg/cpp0x/constexpr-nonlit2.C
===
--- testsuite/g++.dg/cpp0x/constexpr-nonlit2.C  (revision 266174)
+++ testsuite/g++.dg/cpp0x/constexpr-nonlit2.C  (working copy)
@@ -16,4 +16,4 @@ template 
 constexpr W make_w(T& w) { return W(w); }
 
 A a;
-constexpr auto w = make_w(a);  // { dg-error "" }
+constexpr auto w = make_w(a);  // { dg-error "16:the type .const W. of 
.constexpr. variable .w. is not literal" }
Index: testsuite/g++.dg/cpp1z/constexpr-lambda15.C
===
--- testsuite/g++.dg/cpp1z/constexpr-lambda15.C (revision 266174)
+++ testsuite/g++.dg/cpp1z/constexpr-lambda15.C (working copy)
@@ -3,7 +3,7 @@
 
 struct S {
   constexpr S(int i) {
-auto f = [i]{};// { dg-error "literal" "" { target c++14_only 
} }
+auto f = [i]{};// { dg-error "10:variable .f. of non-literal 
type" "" { target c++14_only } }
   }
 };
 int main() {}
Index: testsuite/g++.dg/ext/constexpr-vla5.C
===
--- testsuite/g++.dg/ext/constexpr-vla5.C   (revision 266174)
+++ testsuite/g++.dg/ext/constexpr-vla5.C   (working copy)
@@ -3,5 +3,6 @@
 
 void foo(int i)
 {
-  constexpr char x[i] = "";// { dg-error 

Re: Don't use %z printf format length specified

2018-11-15 Thread Michael Matz
Hi,

On Thu, 15 Nov 2018, Richard Biener wrote:

> > Okay, probably.  Then consider the same patch with sprinkling casts to 
> > long for all these arguments (the actual numbers will fit that type in 
> > reality).  I could of course also use PRIu64 but that makes my eyes 
> > bleed.
> 
> Hmm.  Can you use PRIu64 please and cast to uint64_t?  OK with that 
> change.

Instead I'd like to check in this.  It adds a new macro expanding to 
  "%" #n PRIu64 "%c"
(with n being the width) corresponding to one SIZE_AMOUNT macro "argument" 
(which are actually two arguments), and hides the cast in the latter macro 
itself.  At least the number of format string args and actual args 
corresponds again then and my eyes bleed a little less.  Still okay?


Ciao,
Michael.

* system.h (PRsa): New macro.
(SIZE_AMOUNT): Cast number to uint64_t.
* alloc-pool.h (pool_usage::dump): Don't use %zu but PRsa.
(pool_usage::dump_footer): Likewise and also use PRIu64.
* bitmap.h (bitmap_usage::dump): Likewise.
* ggc-common.c (ggc_usage::dump): Likewise.
* ggc-page.c (ggc_print_statistics): Likewise.
* mem-stats.h (mem_usage::dump): Likewise.
(mem_usage::dump_footer): Likewise.
* rtl.c (dump_rtx_statistics): Likewise.
* vec.c (vec_usage::dump): Likewise.
(vec_usage::dump_footer): Likewise.

diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h
index d17a05ca4fb1..81cb69e227ab 100644
--- a/gcc/alloc-pool.h
+++ b/gcc/alloc-pool.h
@@ -63,8 +63,8 @@ struct pool_usage: public mem_usage
   {
 char *location_string = loc->to_string ();
 
-fprintf (stderr, "%-32s%-48s %5zu%c%9zu%c:%5.1f%%%9zu"
-"%c%9zu%c:%5.1f%%%12zu\n",
+fprintf (stderr, "%-32s%-48s " PRsa(5) PRsa(9) ":%5.1f%%"
+PRsa(9) PRsa(9) ":%5.1f%%%12" PRIu64 "\n",
 m_pool_name, location_string,
 SIZE_AMOUNT (m_instances),
 SIZE_AMOUNT (m_allocated),
@@ -72,7 +72,7 @@ struct pool_usage: public mem_usage
 SIZE_AMOUNT (m_peak),
 SIZE_AMOUNT (m_times),
 get_percent (m_times, total.m_times),
-m_element_size);
+(uint64_t)m_element_size);
 
 free (location_string);
   }
@@ -91,7 +91,7 @@ struct pool_usage: public mem_usage
   dump_footer ()
   {
 print_dash_line ();
-fprintf (stderr, "%s%82zu%c%10zu%c\n", "Total",
+fprintf (stderr, "%s" PRsa(82) PRsa(10) "\n", "Total",
 SIZE_AMOUNT (m_instances), SIZE_AMOUNT (m_allocated));
 print_dash_line ();
   }
diff --git a/gcc/bitmap.h b/gcc/bitmap.h
index 973ea846baf1..9a180daa7454 100644
--- a/gcc/bitmap.h
+++ b/gcc/bitmap.h
@@ -239,9 +239,9 @@ struct bitmap_usage: public mem_usage
   {
 char *location_string = loc->to_string ();
 
-fprintf (stderr, "%-48s %9zu%c:%5.1f%%"
-"%9zu%c%9zu%c:%5.1f%%"
-"%11" PRIu64 "%c%11" PRIu64 "%c%10s\n",
+fprintf (stderr, "%-48s " PRsa (9) ":%5.1f%%"
+PRsa (9) PRsa (9) ":%5.1f%%"
+PRsa (11) PRsa (11) "%10s\n",
 location_string, SIZE_AMOUNT (m_allocated),
 get_percent (m_allocated, total.m_allocated),
 SIZE_AMOUNT (m_peak), SIZE_AMOUNT (m_times),
diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c
index 9fdba23ce4c2..c989fb01e669 100644
--- a/gcc/ggc-common.c
+++ b/gcc/ggc-common.c
@@ -884,8 +884,8 @@ struct ggc_usage: public mem_usage
   {
 size_t balance = get_balance ();
 fprintf (stderr,
-"%-48s %9zu%c:%5.1f%%%9zu%c:%5.1f%%"
-"%9zu%c:%5.1f%%%9zu%c:%5.1f%%%9zu%c\n",
+"%-48s " PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%"
+PRsa (9) ":%5.1f%%" PRsa (9) ":%5.1f%%" PRsa (9) "\n",
 prefix, SIZE_AMOUNT (m_collected),
 get_percent (m_collected, total.m_collected),
 SIZE_AMOUNT (m_freed), get_percent (m_freed, total.m_freed),
diff --git a/gcc/ggc-page.c b/gcc/ggc-page.c
index 00c2864711f0..f04b22ca8cca 100644
--- a/gcc/ggc-page.c
+++ b/gcc/ggc-page.c
@@ -2288,14 +2288,15 @@ ggc_print_statistics (void)
  overhead += (sizeof (page_entry) - sizeof (long)
   + BITMAP_SIZE (OBJECTS_IN_PAGE (p) + 1));
}
-  fprintf (stderr, "%-8zu %10zu%c %10zu%c %10zu%c\n",
-  OBJECT_SIZE (i),
+  fprintf (stderr, "%-8" PRIu64 " " PRsa (10) " " PRsa (10) " "
+  PRsa (10) "\n",
+  (uint64_t)OBJECT_SIZE (i),
   SIZE_AMOUNT (allocated),
   SIZE_AMOUNT (in_use),
   SIZE_AMOUNT (overhead));
   total_overhead += overhead;
 }
-  fprintf (stderr, "%-8s %10zu%c %10zu%c %10zu%c\n",
+  fprintf (stderr, "%-8s " PRsa (10) " " PRsa (10) " " PRsa (10) "\n",
   "Total",
   SIZE_AMOUNT (G.bytes_mapped),
   SIZE_AMOUNT (G.allocated),
@@ -2306,42 +2307,42 @@ ggc_print_statistics (void)
   fprintf (stderr, "\nTotal allocations and overheads during "
   "the compilation process\n");

[C++ DR 2336] Clean up synth walkers first

2018-11-15 Thread Nathan Sidwell
This patch cleans up walk_field_subobs, synthesized_method_base_walk & 
synthesized_method_walk.  They use a mixture of old-style 
partially-overlapping booleans /and/ new style special_function_kind 
enumeration.  synthesized_method_walk was determining the booleans from 
sfk and then passing the booleans to synthesized_method_base_walk, but 
both the bools and sfk to walk_field_subobs.  Plus for extra confusion 
walk_field_subobs & synthesized_method_base_walk wanted the same args, 
but in different orders.


All this confusion gets in the way of resolving DR 2336.  This patch 
removes that confusion.  I reorder SFK members to be useful for these 
functions, add a set of new predicates and then adjust the 3 functions 
to use those predicates, drop the bools and consistently order their 
parameters.


Applying to trunk after an x6_64-linux bootstrap.

nathan

--
Nathan Sidwell
2018-11-15  Nathan Sidwell  

	* cp-tree.h (enum special_function_kind): Reorder and comment.
	* method.c (SFK_CTOR_P, SFK_DTOR_P, SFK_ASSIGN_P, SFK_COPY_P)
	(SFK_MOVE_P): New predicates.
	(walk_field_subobs, synthesized_method_base_walk): Drop
	copy_arg_p, move_p, assign_p args.  Use new SFK predicates.  Order
	parameters consistently.
	(synthesized_method_walk): Drop ctor_p, copy_arg_p, move_p,
	assign_p calculations.  Use new SFK predicates.  Adjust calls to
	worker functions.

Index: cp-tree.h
===
--- cp-tree.h	(revision 266161)
+++ cp-tree.h	(working copy)
@@ -5084,20 +5084,22 @@ enum special_function_kind {
   sfk_none = 0,		   /* Not a special function.  This enumeral
 			  must have value zero; see
 			  special_function_p.  */
+  /* The following are ordered, for use by member synthesis fns.  */
+  sfk_destructor,	   /* A destructor.  */
   sfk_constructor,	   /* A constructor.  */
+  sfk_inheriting_constructor, /* An inheriting constructor */
   sfk_copy_constructor,/* A copy constructor.  */
   sfk_move_constructor,/* A move constructor.  */
   sfk_copy_assignment, /* A copy assignment operator.  */
   sfk_move_assignment, /* A move assignment operator.  */
-  sfk_destructor,	   /* A destructor.  */
+  /* The following are unordered.  */
   sfk_complete_destructor, /* A destructor for complete objects.  */
   sfk_base_destructor, /* A destructor for base subobjects.  */
   sfk_deleting_destructor, /* A destructor for complete objects that
 			  deletes the object after it has been
 			  destroyed.  */
   sfk_conversion,	   /* A conversion operator.  */
-  sfk_deduction_guide,	   /* A class template deduction guide.  */
-  sfk_inheriting_constructor /* An inheriting constructor */
+  sfk_deduction_guide	   /* A class template deduction guide.  */
 };
 
 /* The various kinds of linkage.  From [basic.link],
Index: method.c
===
--- method.c	(revision 266161)
+++ method.c	(working copy)
@@ -1283,15 +1283,26 @@ process_subob_fn (tree fn, tree *spec_p,
 }
 }
 
+/* Categorize various special_function_kinds.  */
+#define SFK_CTOR_P(sfk) \
+  ((sfk) >= sfk_constructor && (sfk) <= sfk_move_constructor)
+#define SFK_DTOR_P(sfk) \
+  ((sfk) == sfk_destructor)
+#define SFK_ASSIGN_P(sfk) \
+  ((sfk) == sfk_copy_assignment || (sfk) == sfk_move_assignment)
+#define SFK_COPY_P(sfk) \
+  ((sfk) == sfk_copy_constructor || (sfk) == sfk_copy_assignment)
+#define SFK_MOVE_P(sfk) \
+  ((sfk) == sfk_move_constructor || (sfk) == sfk_move_assignment)
+
 /* Subroutine of synthesized_method_walk to allow recursion into anonymous
aggregates.  If DTOR_FROM_CTOR is true, we're walking subobject destructors
called from a synthesized constructor, in which case we don't consider
the triviality of the subobject destructor.  */
 
 static void
-walk_field_subobs (tree fields, tree fnname, special_function_kind sfk,
-		   int quals, bool copy_arg_p, bool move_p,
-		   bool assign_p, tree *spec_p, bool *trivial_p,
+walk_field_subobs (tree fields, special_function_kind sfk, tree fnname,
+		   int quals, tree *spec_p, bool *trivial_p,
 		   bool *deleted_p, bool *constexpr_p,
 		   bool diag, int flags, tsubst_flags_t complain,
 		   bool dtor_from_ctor)
@@ -1315,7 +1326,7 @@ walk_field_subobs (tree fields, tree fnn
 	break;
 
   mem_type = strip_array_types (TREE_TYPE (field));
-  if (assign_p)
+  if (SFK_ASSIGN_P (sfk))
 	{
 	  bool bad = true;
 	  if (CP_TYPE_CONST_P (mem_type) && !CLASS_TYPE_P (mem_type))
@@ -1419,19 +1430,18 @@ walk_field_subobs (tree fields, tree fnn
 
   if (ANON_AGGR_TYPE_P (mem_type))
 	{
-	  walk_field_subobs (TYPE_FIELDS (mem_type), fnname, sfk, quals,
-			 copy_arg_p, move_p, assign_p, spec_p, trivial_p,
-			 deleted_p, constexpr_p,
+	  walk_field_subobs (TYPE_FIELDS (mem_type), sfk, fnname, quals,
+			 spec_p, trivial_p, deleted_p, constexpr_p,
 			 diag, flags, complain, dtor_from_ctor);
 	  continue;
 	}
 
-  if 

RE: [PATCH 1/9][GCC][AArch64][middle-end] Implement SLP recognizer for Complex addition with rotate and complex MLA with rotation

2018-11-15 Thread Richard Biener
On Wed, 14 Nov 2018, Tamar Christina wrote:

> Hi Richard,
> 
> Thanks for the feedback, I've replied inline below.
> I'll wait for your answers before making changes.

I have commented on the other mail so will leave out redunant parts
here.

> > -Original Message-
> > From: Richard Biener 
> > Sent: Wednesday, November 14, 2018 12:21
> > To: Tamar Christina 
> > Cc: GCC Patches ; nd ; Richard
> > Guenther ; Zdenek Dvorak ; Richard
> > Earnshaw ; James Greenhalgh
> > ; Marcus Shawcroft
> > 
> > Subject: Re: [PATCH 1/9][GCC][AArch64][middle-end] Implement SLP
> > recognizer for Complex addition with rotate and complex MLA with rotation
> > 
> > On Sun, Nov 11, 2018 at 11:26 AM Tamar Christina
> >  wrote:
> > >
> > > Hi All,
> > >
> > > This patch adds support for SLP vectorization of Complex number
> > > arithmetic with rotations along with Argand plane.
> > >
> > > For this to work it has to recognize two statements in parallel as it
> > > needs to match against operations towards both the real and imaginary
> > > numbers.  The instructions generated by this change is also only
> > > available in their vector forms.  As such I add them as pattern 
> > > statements to
> > the stmt.  The BB is left untouched and so the scalar loop is untouched.
> > >
> > > The instructions also require the loads to be contiguous and so when a
> > > match is made, and the code decides it is able to do the replacement
> > > it re-organizes the SLP tree such that the loads are contiguous.
> > > Since this operation cannot be undone it only does this if it's sure that 
> > > the
> > resulting loads can be made continguous.
> > >
> > > It gets this guarantee by only allowing the replacement if between the
> > > matched expression and the loads there are no other expressions it
> > doesn't know aside from type casts.
> > >
> > > When a match occurs over multiple expressions, the dead statements are
> > > immediately removed from the tree to prevent verification failures later.
> > >
> > > Because the pattern matching is done after SLP analysis has analyzed
> > > the usage of the instruction it also marks the instructions as used and 
> > > the
> > old ones as unusued.
> > >
> > > When a replacement is done a new internal function is generated which
> > > the back-end has to expand to the proper instruction sequences.
> > >
> > > For now, this patch only adds support for Complex addition with rotate
> > > and Complex FMLA with rotation of 0 and 180. However it is the
> > > intention to in the future add support for Complex subtraction and
> > Complex multiplication.
> > >
> > > Concretely, this generates
> > >
> > > ldr q1, [x1, x3]
> > > ldr q2, [x0, x3]
> > > ldr q0, [x2, x3]
> > > fcmla   v0.2d, v1.2d, v2.2d, #180
> > > fcmla   v0.2d, v1.2d, v2.2d, #90
> > > str q0, [x2, x3]
> > > add x3, x3, 16
> > > cmp x3, 3200
> > > bne .L2
> > > ret
> > >
> > > now instead of
> > >
> > > add x3, x2, 31
> > > sub x4, x3, x1
> > > sub x3, x3, x0
> > > cmp x4, 62
> > > mov x4, 62
> > > ccmpx3, x4, 0, hi
> > > bls .L5
> > > mov x3, x0
> > > mov x0, x1
> > > add x1, x2, 3200
> > > .p2align 3,,7
> > > .L3:
> > > ld2 {v16.2d - v17.2d}, [x2]
> > > ld2 {v2.2d - v3.2d}, [x3], 32
> > > ld2 {v0.2d - v1.2d}, [x0], 32
> > > mov v7.16b, v17.16b
> > > fmulv6.2d, v0.2d, v3.2d
> > > fmlav7.2d, v1.2d, v3.2d
> > > fmlav6.2d, v1.2d, v2.2d
> > > fmlsv7.2d, v2.2d, v0.2d
> > > faddv4.2d, v6.2d, v16.2d
> > > mov v5.16b, v7.16b
> > > st2 {v4.2d - v5.2d}, [x2], 32
> > > cmp x2, x1
> > > bne .L3
> > > ret
> > > .L5:
> > > add x4, x2, 8
> > > add x6, x0, 8
> > > add x5, x1, 8
> > > mov x3, 0
> > > .p2align 3,,7
> > > .L2:
> > > ldr d1, [x6, x3]
> > > ldr d4, [x1, x3]
> > > ldr d5, [x5, x3]
> > > ldr d3, [x0, x3]
> > > fmuld2, d4, d1
> > > ldr d0, [x4, x3]
> > > fmadd   d0, d5, d1, d0
> > > ldr d1, [x2, x3]
> > > fmadd   d2, d5, d3, d2
> > > fmsub   d0, d4, d3, d0
> > > faddd1, d1, d2
> > > str d1, [x2, x3]
> > > str d0, [x4, x3]
> > > add x3, x3, 16
> > > cmp x3, 3200
> > > bne .L2
> > > ret
> > >
> > > Bootstrap and Regtests on aarch64-none-linux-gnu, arm-none-gnueabihf
> > > and x86_64-pc-linux-gnu are still on going but previous patch showed no
> > regressions.
> > >
> > > The instructions have also been tested on aarch64-none-elf and
> > > arm-none-eabi on a Armv8.3-a model and -march=Armv8.3-a+fp16 and all
> > tests pass.

Re: [PATCH 8/9][GCC][Arm] Add autovectorization support for complex multiplication and addition

2018-11-15 Thread Richard Biener
On Thu, Nov 15, 2018 at 1:42 PM Richard Biener
 wrote:
>
> On Wed, Nov 14, 2018 at 4:47 PM Tamar Christina  
> wrote:
> >
> > Hi Richard,
> >
> > > > Ok for trunk?
> > >
> > > +;; The complex mla operations always need to expand to two instructions.
> > > +;; The first operation does half the computation and the second does
> > > +the ;; remainder.  Because of this, expand early.
> > > +(define_expand "fcmla4"
> > > +  [(set (match_operand:VF 0 "register_operand")
> > > +   (plus:VF (match_operand:VF 1 "register_operand")
> > > +(unspec:VF [(match_operand:VF 2 "register_operand")
> > > +(match_operand:VF 3 "register_operand")]
> > > +VCMLA)))]
> > > +  "TARGET_COMPLEX"
> > > +{
> > > +  emit_insn (gen_neon_vcmla (operands[0],
> > > operands[1],
> > > + operands[2],
> > > +operands[3]));
> > > +  emit_insn (gen_neon_vcmla (operands[0],
> > > operands[0],
> > > + operands[2],
> > > +operands[3]));
> > > +  DONE;
> > > +})
> > >
> > > What's the two halves?  Why hide this from the vectorizer if you go down 
> > > all
> > > to the detail and expose the rotation to it?
> > >
> >
> > The two halves are an implementation detail of the instruction in 
> > Armv8.3-a. As far as the
> > Vectorizer is concerned all you want to do, is an FMA rotating one of the 
> > operands by 0 or 180 degrees.
> >
> > Also note that the "rotations" in these instructions aren't exactly the 
> > same as what would fall under rotation of a complex number,
> > as each instruction can only do half of the final computation you want.
> >
> > In the ISA these instructions have to be used in a pair, where rotations 
> > determine
> > the operation you want to perform. E.g. a rotation of #0 followed by #90 
> > makes it a multiply and accumulate.
> >
> > A rotation of #180 followed by #90 makes this a vector complex subtract, 
> > which is intended to be used by the first call
> > using a register cleared with 0 (It becomes an "FMS" essentially if you 
> > don't clear the register).
> > Each "rotation" determine what operation is done and using which parts of 
> > the complex number. You change the
> > "rotations" and the grouping of the instructions to get different 
> > operations.
> >
> > I did not expose this to the vectorizer as It seems very ISA specific.
> >
> > > +;; The vcadd and vcmla patterns are made UNSPEC for the explicitly due
> > > +to the ;; fact that their usage need to guarantee that the source
> > > +vectors are ;; contiguous.  It would be wrong to describe the operation
> > > +without being able ;; to describe the permute that is also required,
> > > +but even if that is done ;; the permute would have been created as a
> > > +LOAD_LANES which means the values ;; in the registers are in the wrong
> > > order.
> > >
> > > Hmm, it's totally non-obvious to me how this relates to loads or what a 
> > > "non-
> > > contiguous"
> > > register would be?  That is, once you make this an unspec combine will 
> > > never
> > > be able to synthesize this from intrinsics code that doesn't use this 
> > > form.
> > >
> > > +(define_insn "neon_vcadd"
> > > +  [(set (match_operand:VF 0 "register_operand" "=w")
> > > +   (unspec:VF [(match_operand:VF 1 "register_operand" "w")
> > > +   (match_operand:VF 2 "register_operand" "w")]
> > > +   VCADD))]
> > >
> >
> > Yes that's my goal, as if operand1 and operand2 are loaded by instructions 
> > that
> > would have permuted the values in the register then the instruction doesn't 
> > work.
> >
> > The instruction does the permute itself, so it expects the values to have 
> > been loaded
> > using a simple load and not a LOAD_LANES. So I am intended to prevent 
> > combine from
> > recognizing the operation for that reason.
>
> But LOAD_LANES is used differently and the ISA probably doesn't really care 
> how
> you set up the register inputs.  You of course have to put in the
> correct values but
> how they get there doesn't matter.  So I don't see how combine can
> mess things up here.
>
> >  For the ADD combine can be used but then you'd
> > have to match the load and store since you have to change these, for the 
> > rest you'll run far afoul
> > of combine's 5 instruction limit.
>
> Why do you need to change these?  You assume the vectorizer vectorizes using
> interleaving - yes, in that case all hope is lost.  I assume the
> vectorizer will end up
> doing SLP with the existing TWO_OPERATORS support

You might be bitten by the fact that you tuned the vectorizer to always prefer
load/store-lanes over SLP when there are permutations.  You could lift
that a bit allowing rotation/projection permutes as they occur with complex
arithmetic.

> , thus for complex subtraction
> you'll see (A and B being complex vectors)
>
>add = A + B;
>sub = A - B;
>resultAcomplex_minusB = vec_merge (add, 

Re: [Committed][AArch64] Fix PR62178 testcase failures

2018-11-15 Thread Wilco Dijkstra
Hi Segher,

> On Wed, Nov 14, 2018 at 12:37:05PM +, Wilco Dijkstra wrote:
>> +/* { dg-final { scan-assembler-not { dup } } } */
>> +/* { dg-final { scan-assembler-not { fmov } } } */
>
> { dup }   is the same as   " dup "  , that is, with spaces and all.
> I don't think you want that (there usually is a tab character before
> mnemonics, so \s would work better, or use \m and \M.  Or nothing,
> just delete the spaces, if you are sure nothing in the generated
> assembler code says "dup").

Thanks for spotting that! I removed the spaces (below). The syntax is
arcane to say the least but it seems {} has fewer bugs in regular expressions.
I can never figure out how many \ to use so it works exactly identically across
different regexp/tcl versions.

Wilco

--- gcc/testsuite/gcc.target/aarch64/pr62178.c  (revision 266178)
+++ gcc/testsuite/gcc.target/aarch64/pr62178.c  (working copy)
@@ -18,5 +18,5 @@
 
 /* { dg-final { scan-assembler "ldr\\tq\[0-9\]+, \\\[x\[0-9\]+\\\], \[0-9\]+" 
} } */
 /* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+" 
} } */
-/* { dg-final { scan-assembler-not { dup } } } */
-/* { dg-final { scan-assembler-not { fmov } } } */
+/* { dg-final { scan-assembler-not {dup} } } */
+/* { dg-final { scan-assembler-not {fmov} } } */


Re: [PATCH 8/9][GCC][Arm] Add autovectorization support for complex multiplication and addition

2018-11-15 Thread Richard Biener
On Wed, Nov 14, 2018 at 4:47 PM Tamar Christina  wrote:
>
> Hi Richard,
>
> > > Ok for trunk?
> >
> > +;; The complex mla operations always need to expand to two instructions.
> > +;; The first operation does half the computation and the second does
> > +the ;; remainder.  Because of this, expand early.
> > +(define_expand "fcmla4"
> > +  [(set (match_operand:VF 0 "register_operand")
> > +   (plus:VF (match_operand:VF 1 "register_operand")
> > +(unspec:VF [(match_operand:VF 2 "register_operand")
> > +(match_operand:VF 3 "register_operand")]
> > +VCMLA)))]
> > +  "TARGET_COMPLEX"
> > +{
> > +  emit_insn (gen_neon_vcmla (operands[0],
> > operands[1],
> > + operands[2],
> > +operands[3]));
> > +  emit_insn (gen_neon_vcmla (operands[0],
> > operands[0],
> > + operands[2],
> > +operands[3]));
> > +  DONE;
> > +})
> >
> > What's the two halves?  Why hide this from the vectorizer if you go down all
> > to the detail and expose the rotation to it?
> >
>
> The two halves are an implementation detail of the instruction in Armv8.3-a. 
> As far as the
> Vectorizer is concerned all you want to do, is an FMA rotating one of the 
> operands by 0 or 180 degrees.
>
> Also note that the "rotations" in these instructions aren't exactly the same 
> as what would fall under rotation of a complex number,
> as each instruction can only do half of the final computation you want.
>
> In the ISA these instructions have to be used in a pair, where rotations 
> determine
> the operation you want to perform. E.g. a rotation of #0 followed by #90 
> makes it a multiply and accumulate.
>
> A rotation of #180 followed by #90 makes this a vector complex subtract, 
> which is intended to be used by the first call
> using a register cleared with 0 (It becomes an "FMS" essentially if you don't 
> clear the register).
> Each "rotation" determine what operation is done and using which parts of the 
> complex number. You change the
> "rotations" and the grouping of the instructions to get different operations.
>
> I did not expose this to the vectorizer as It seems very ISA specific.
>
> > +;; The vcadd and vcmla patterns are made UNSPEC for the explicitly due
> > +to the ;; fact that their usage need to guarantee that the source
> > +vectors are ;; contiguous.  It would be wrong to describe the operation
> > +without being able ;; to describe the permute that is also required,
> > +but even if that is done ;; the permute would have been created as a
> > +LOAD_LANES which means the values ;; in the registers are in the wrong
> > order.
> >
> > Hmm, it's totally non-obvious to me how this relates to loads or what a 
> > "non-
> > contiguous"
> > register would be?  That is, once you make this an unspec combine will never
> > be able to synthesize this from intrinsics code that doesn't use this form.
> >
> > +(define_insn "neon_vcadd"
> > +  [(set (match_operand:VF 0 "register_operand" "=w")
> > +   (unspec:VF [(match_operand:VF 1 "register_operand" "w")
> > +   (match_operand:VF 2 "register_operand" "w")]
> > +   VCADD))]
> >
>
> Yes that's my goal, as if operand1 and operand2 are loaded by instructions 
> that
> would have permuted the values in the register then the instruction doesn't 
> work.
>
> The instruction does the permute itself, so it expects the values to have 
> been loaded
> using a simple load and not a LOAD_LANES. So I am intended to prevent combine 
> from
> recognizing the operation for that reason.

But LOAD_LANES is used differently and the ISA probably doesn't really care how
you set up the register inputs.  You of course have to put in the
correct values but
how they get there doesn't matter.  So I don't see how combine can
mess things up here.

>  For the ADD combine can be used but then you'd
> have to match the load and store since you have to change these, for the rest 
> you'll run far afoul
> of combine's 5 instruction limit.

Why do you need to change these?  You assume the vectorizer vectorizes using
interleaving - yes, in that case all hope is lost.  I assume the
vectorizer will end up
doing SLP with the existing TWO_OPERATORS support, thus for complex subtraction
you'll see (A and B being complex vectors)

   add = A + B;
   sub = A - B;
   resultAcomplex_minusB = vec_merge (add, sub, 1)

basically the vectorizer will perform operations twice and then blend the two
results.  The add/sub + blend needs to be recognized by combine
(x86 does this for the vaddsub instructions which were designed to handle
complex subtraction and parts of the multiply).

For complex multiplication you'll see the pieces your ISA supports.

  mul1 = A * B
  mul2 = A * rot(B)  (rotation will be a shuffle)
  add = mul1 + mul2
  sub = mul1 - mul2
  result = blend (add, sub, ...)

as usual the combiner is helped by intermediate combiner 

Re: RFA: vectorizer patches 1/2 : WIDEN_MULT_PLUS support

2018-11-15 Thread Richard Biener
On Wed, Nov 14, 2018 at 4:21 PM Joern Wolfgang Rennecke
 wrote:
>
>
> On 14/11/18 09:53, Richard Biener wrote:
> >> WIDEN_MULT_PLUS is special on our target in that it creates double-sized
> >> vectors.
> > Are there really double-size vectors or does the target simply produce
> > the output in two vectors?  Usually targets have WIDEN_MULT_PLUS_LOW/HIGH
> > or _EVEN/ODD split operations.  Or, like - what I now remember - for the
> > DOT_PROD_EXPR optab, the target already reduces element pairs of the
> > result vector (unspecified which ones) so the result vector is of the same
> > size as the inputs.
> The output of widening multiply and widening multiply-add is stored
> in two consecutive registers.  So, they can be used as separate
> vectors, but you can't choose the register numbers indepenndently.
> OTOH, you can treat them together as a double-sized vector, but
> without any extra alignment requirements over a single-sized vector.
> >
> > That is, if your target produces two vectors you may instead want to hide
> > that fact by claiming you support DOT_PROD_EXPR and expanding
> > it to the widen-mult-plus plus reducing (adding) the two result vectors to
> > get a single one.
>
> Doing a part of the reduction in the loop is a bit pointless.

The advantage is you can work with lower vectorization factor and thus
less unrolling (smaller code).

> I have tried another approach, to advertize the WIDEN_MULT_PLUS
> and WIDEN_MULT operations as LO/HI part operations of the double
> vector size, and also add fake double-vector patterns for move, widening
> and add (they get expanded or splitted to single-vector patterns).
> That seems to work for the dot product, it's like the code is unrolled by
> a factor of two.

Yeah, that would work as well.  As said, the vectorizer cannot really
handle the doubled vector size (means: you sooner or later will run into
issues).

>  There are a few drawbacks though:
> - the tree optimizer creates separate WIDEN_MULT and PLUS expressions,
> and it is left to the combiner to clean that up.  That combination and
> register allocation
> might be a bit fragile.
> - if the input isn't known to be aligned to the doubled vector size, a
> run-time
> check is inserted to use an unvectorized loop if there is no excess
> alignment.
> - auto-increment for the loads is lost.  I can probably fix this by
> keeping double-sized
> loads around for longer or with some special-purpose pass, but both
> might have
> some other drawbacks.  But there's actually a configuration option for
> an instruction
> to load multiple vector registers with register-indirect or
> auto-increment, so there is
> some merit to have a pattern for it.
> - the code size is larger.
> - vectorization will fail if any other code is mixed in for which no
> double-vector patterns are provided.
> - this approach uses SUBREGs in ways that are not safe according to the
> documentation.
> But then, other ports like i386 and aarch64-little endian do that too.

I think they do it if they really have such instructions in the ISA
(they also have those that do the in-loop reduction to half of the result
vector size -- DOT_PROD_EXPR).

> I think it is now (since we have
> SUBREG_BYTE) safe to have subregs of registers with hard reg sizes
> larger than UNITS_PER_WORD,
> as long as you refer to entire hard registers.  Maybe we could change
> the documentation?
> AFAICT, there are also only four places that need to be patched to make
> a lowpart access with a SUBREG of such a hard register safe. I'm trying
> this at the moment, it was justa few hours late for the
> phase 1->3 deadline.
>
> I suppose for WIDEN_SUM_EXPR, I'd have to have one double-vector-sized
> pattern that
> adds the products of the two input vectors into the double output
> vector, and leave
> the rtl loop optimizer to get the constant pool load of the all-ones
> vector out of
> the loop.  But again, there'll be issues with excess alignment
> requirements and code size.

I think going the DOT_PROD_EXPR way is a lot easier.  You simply
expand the additional (in-loop) sum.  The only drawback I see is that
this might be slower code.

So yes the a _LO/_HI way maps better to hardware but you rely on
CSE to remove the redundant instruction if you implement _LO/_HI
as doing the full operation and just taking one of the result vectors.

> >
> > The vectorizer cannot really deal with multiple sizes, thus for example
> > a V4SI * V4SI + V4DI operation and that all those tree codes are exposed
> > as "scalar" is sth that continues to confuse me but is mainly done because
> > at pattern recognition time there's only the scalars.
> Well, the vectorizer makes an exception for reductions as it'll allow to
> maintain
> either a vector or a scalar during the loop, so why not allow other
> sizes for that
> value as well?

It's not implemented ;)

>  It's all hidden in the final reduction emitted by the
> epilogue.
> > For vectorization I would advise to provide expansion patterns 

Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].

2018-11-15 Thread Jan Hubicka
> > A warning seems appropriate.  You think the front end is the right
> > place for that?
> 
> Probably yes. Note that middle-end can optimize about dead branches and so 
> that
> theoretically one can end up with a branching where e.g. both branches are 
> [[likely]].
> I wouldn't bother users with these.

Note that what really happens in this case is that if conditional is
constant propagated and it has predict_expr, the predict_expr stays and
will get assigned to the random control dependence edge which controls
execution of the original statement.  This is not very intuitive
behaviour. Does C++ say what should happen in this case?
One option would be to deal with this gratefully at high level gimple
and turn predict_exprs into edge probabilities eariler than we do normal
branch prediction (which is intended to be later so profile does not end
up unnecesarily inconsistent)

Honza
> 
> Martin
> 
> > 
> > Jason
> > 
> 


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-15 Thread Jozef Lawrynowicz
On Thu, 15 Nov 2018 10:36:57 +0100
Andreas Schwab  wrote:

> On Nov 14 2018, Jozef Lawrynowicz  wrote:
> 
> > The timeout as set in the dejagnu configuration for msp430
> > ([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely
> > hit.  
> 
> I don't think it makes sense for a board file to set a smaller timeout
> than the default.
> 
> Andreas.
> 

Hmm, yes I see that some other DejaGNU board files increase the timeout for
GCC but no others are decreasing it.

I suspect this was just an oversight with the initial port of board file.

I'll go ahead and remove the timeout from the board file and
rerun the tests, with those dg-timeout directives also removed.

Thanks,
Jozef


[v3 PATCH] PR libstdc++/87855

2018-11-15 Thread Ville Voutilainen
Tested on Linux-PPC64. Ok for trunk? Backports?

2018-11-15  Ville Voutilainen  

PR libstdc++/87855
Also implement P0602R4 (variant and optional
should propagate copy/move triviality) for std::optional.
* include/std/optional (_Optional_payload): Change
the main constraints to check constructibility in
addition to assignability.
(operator=): Make constexpr.
(_M_reset): Likewise.
(_M_construct): Likewise.
(operator->): Likewise.
* testsuite/20_util/optional/assignment/8.cc: Adjust.
* testsuite/20_util/optional/assignment/9.cc: New.
diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index d0257c0..fefd9a3 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -103,10 +103,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template ,
-	bool /*_HasTrivialCopyAssignment*/ =
-	  is_trivially_copy_assignable_v<_Tp>,
-	bool /*_HasTrivialMoveAssignment*/ =
-	  is_trivially_move_assignable_v<_Tp>>
+	bool /*_HasTrivialCopy */ =
+	  is_trivially_copy_assignable_v<_Tp>
+	  && is_trivially_copy_constructible_v<_Tp>,
+	bool /*_HasTrivialMove */ =
+	  is_trivially_move_assignable_v<_Tp>
+	  && is_trivially_move_constructible_v<_Tp>>
 struct _Optional_payload
 {
   constexpr _Optional_payload() noexcept : _M_empty() { }
@@ -148,6 +150,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  this->_M_construct(std::move(__other._M_payload));
   }
 
+  constexpr
   _Optional_payload&
   operator=(const _Optional_payload& __other)
   {
@@ -163,6 +166,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	return *this;
   }
 
+  constexpr
   _Optional_payload&
   operator=(_Optional_payload&& __other)
   noexcept(__and_v,
@@ -216,6 +220,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return this->_M_payload; }
 
   // _M_reset is a 'safe' operation with no precondition.
+  constexpr
   void
   _M_reset() noexcept
   {
@@ -346,6 +351,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Optional_payload(const _Optional_payload&) = default;
   _Optional_payload(_Optional_payload&&) = default;
 
+  constexpr
   _Optional_payload&
   operator=(const _Optional_payload& __other)
   {
@@ -394,6 +400,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return this->_M_payload; }
 
   // _M_reset is a 'safe' operation with no precondition.
+  constexpr
   void
   _M_reset() noexcept
   {
@@ -466,6 +473,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Optional_payload&
   operator=(const _Optional_payload& __other) = default;
 
+  constexpr
   _Optional_payload&
   operator=(_Optional_payload&& __other)
   noexcept(__and_v,
@@ -513,6 +521,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return this->_M_payload; }
 
   // _M_reset is a 'safe' operation with no precondition.
+  constexpr
   void
   _M_reset() noexcept
   {
@@ -581,6 +590,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Optional_payload(const _Optional_payload&) = default;
   _Optional_payload(_Optional_payload&&) = default;
 
+  constexpr
   _Optional_payload&
   operator=(const _Optional_payload& __other)
   {
@@ -596,6 +606,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	return *this;
   }
 
+  constexpr
   _Optional_payload&
   operator=(_Optional_payload&& __other)
   noexcept(__and_v,
@@ -624,6 +635,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   bool _M_engaged;
 
   template
+constexpr
 void
 _M_construct(_Args&&... __args)
 noexcept(is_nothrow_constructible_v<_Stored_type, _Args...>)
@@ -643,6 +655,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return this->_M_payload; }
 
   // _M_reset is a 'safe' operation with no precondition.
+  constexpr
   void
   _M_reset() noexcept
   {
@@ -681,6 +694,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
   
   // _M_reset is a 'safe' operation with no precondition.
+  constexpr
   void
   _M_reset() noexcept
   {
@@ -1217,6 +1231,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   operator->() const
   { return std::__addressof(this->_M_get()); }
 
+  constexpr
   _Tp*
   operator->()
   { return std::__addressof(this->_M_get()); }
diff --git a/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc b/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc
index d573137..7241b92 100644
--- a/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc
+++ b/libstdc++-v3/testsuite/20_util/optional/assignment/8.cc
@@ -91,6 +91,24 @@ struct S2 {
 };
 static_assert(std::is_trivially_move_assignable_v);
 
+struct S3 {
+  S3(const S3&);
+  S3& operator=(const S3&) = default;
+};
+static_assert(std::is_trivially_copy_assignable_v);
+static_assert(std::is_copy_assignable_v);
+static_assert(!std::is_trivially_copy_assignable_v>);
+static_assert(std::is_copy_assignable_v>);
+

Re: [PATCH 5/6] ifcvt: Only created temporaries as needed.

2018-11-15 Thread Robin Dapp
> This looks pretty reasonable.  ISTM it ought to be able to go forward if
> it's tested independently.

The test suite already passes, any other tests you have in mind?  To be
honest I suppose noce_convert_multiple_sets will currently never
successfully return (due to the costing problems I described before), so
a specific test case might be difficult.

Regards
 Robin



Re: [PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2018-11-15 Thread Robin Dapp
> This may ultimately be too simplistic.  There are targets where some
> constants are OK, but others may not be.   By checking the predicate
> like this I think you can cause over-aggressive if-conversion if the
> target allows a range of integers in the expander's operand predicate,
> but allows a narrower range in the actual define_insn (presumably the
> expander loads them into a pseudo or somesuch in that case).
> 
> We know that over-aggressive if-conversion into conditional moves hurts
> some targets.
> 
> Ideally you'd create the actual insn with the constants you want to use
> and see if that's recognized as well as compute its cost.  Is that just
> too painful at this point for some reason?

Conditional moves in noce_convert_multiple_sets are processed via
noce_emit_cmove which itself performs quite some preprocessing and calls
optab's emit_conditional_move in the end, so the insn will already be
emitted before being able to decide whether to decline it due to costs.
In addition, every noce_emit_cmove will emit the condition check again
as well as possible temporaries.

Comparing the costs of the whole sequence will therefore still prove
difficult as all the additionally generated insns will not get removed
until reload and make a fair comparison nigh impossible.

I was reluctant to factor out all the preprocessing stuff, separate it
from the condition check and actual emitting part but that's properly
the "right way" to do it, including emitting the condition only once in
the beginning.

However, for now, I could imagine changing only the
conversion_profitable_p hook in our backend to only count the cmovs and
ignore everything else in the sequence.  This would be somewhat hacky
though and still wouldn't allow constants :)

Regards
 Robin



Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].

2018-11-15 Thread Martin Liška
On 11/13/18 8:42 PM, Jason Merrill wrote:
> On Tue, Nov 13, 2018 at 9:20 AM Martin Liška  wrote:
>>
>> On 11/13/18 5:43 AM, Jason Merrill wrote:
>>> [[likely]] and [[unlikely]] are equivalent to the GNU hot/cold attributes,
>>> except that they can be applied to arbitrary statements as well as labels;
>>> this is most likely to be useful for marking if/else branches as likely or
>>> unlikely.  Conveniently, PREDICT_EXPR fits the bill nicely as a
>>> representation.
>>>
>>> I also had to fix marking case labels as hot/cold, which didn't work before.
>>> Which then required me to force __attribute ((fallthrough)) to apply to the
>>> statement rather than the label.
>>>
>>> Tested x86_64-pc-linux-gnu.  Does this seem like a sane implementation
>>> approach to people with more experience with PREDICT_EXPR?
>>
>> Hi.
>>
>> In general it makes sense to implement it the same way. Question is how much
>> should the hold/cold attribute should be close to __builtin_expect.
>>
>> Let me present few examples and differences that I see:
>>
>> 1) ./xgcc -B. -O2 -fdump-tree-profile_estimate=/dev/stdout /tmp/test1.C
>>
>> ;; Function foo (_Z3foov, funcdef_no=0, decl_uid=2301, cgraph_uid=1, 
>> symbol_order=3)
>>
>> Predictions for bb 2
>>   first match heuristics: 90.00%
>>   combined heuristics: 90.00%
>>   __builtin_expect heuristics of edge 2->3: 90.00%
>>
>> As seen here __builtin_expect is stronger as it's first match heuristics and 
>> has probability == 90%.
>>
>> ;; Function bar (_Z3barv, funcdef_no=1, decl_uid=2303, cgraph_uid=2, 
>> symbol_order=4)
>>
>> Predictions for bb 2
>>   DS theory heuristics: 74.48%
>>   combined heuristics: 74.48%
>>   opcode values nonequal (on trees) heuristics of edge 2->3: 34.00%
>>   hot label heuristics of edge 2->3: 85.00%
>>
>> Here we combine hot label prediction with the opcode one, resulting in quite 
>> poor result 75%.
>> So maybe cold/hot prediction cal also happen first match.
> 
> Makes sense.

Good, let me prepare patch for it.

> 
>> 2) ./xgcc -B. -O2 -fdump-tree-profile_estimate=/dev/stdout /tmp/test2.C
>> ...
>> foo ()
>> {
>> ...
>>   switch (_3)  [3.33%], case 3:  [3.33%], case 42:  
>> [3.33%], case 333:  [90.00%]>
>>
>> while:
>>
>> bar ()
>> {
>>   switch (a.1_1)  [25.00%], case 3:  [25.00%], case 42: 
>>  [25.00%], case 333:  [25.00%]>
>> ...
>>
>> Note that support for __builtin_expect was enhanced in this stage1. I can 
>> definitely cover also situations when one uses
>> hot/cold for labels. So definitely place for improvement.
> 
> Hmm, the gimplifier should be adding a PREDICT_EXPR for the case label
> now, is it just ignored currently?

Yes, for switch multi-branches yes. I'll prepare patch for it, should be quite 
straightforward.

> 
>> 3) last example where one can use the attribute for function decl, resulting 
>> in:
>> __attribute__((hot, noinline))
>> foo ()
>> {
>> ..
>>
>> Hope it's desired? If so I would cover that with a test-case in test-suite.
> 
> [[likely]] and [[unlikely]] don't apply to functions; I suppose I
> should diagnose that.

Yes.

> 
>> Jason can you please point to C++ specification of the attributes?
> 
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0479r5.html
> 
>> Would you please consider an error diagnostics for situations written in 
>> test4.C?
> 
> A warning seems appropriate.  You think the front end is the right
> place for that?

Probably yes. Note that middle-end can optimize about dead branches and so that
theoretically one can end up with a branching where e.g. both branches are 
[[likely]].
I wouldn't bother users with these.

Martin

> 
> Jason
> 



[PATCH] Fix PR88029, change how gimple_call_flags works

2018-11-15 Thread Richard Biener


The following makes sure we preserve a 'const' attribute that
was in effect on an indirect call even after turning it into
a direct call of a non-const (pure for the testcase) function.
This mimics how we handle nothrow/noreturn.

By unioning ECF_ flags from fntype and decl we get the best
of both worlds (for optimization as well as IL consistency).
This seemed easier (and better) than introducing pure/const
flags on the gimple stmt (as opposed to nothrow which can
come in via -fno-exceptions on a function as well).

I skimmed over flags_from_decl_or_type users to see which
might need updating to simply use gimple_call_flags but
only found bitrotting TM code (which I updated).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-11-15  Richard Biener  

PR middle-end/
* gimple.c (gimple_call_flags): Union flags from decl, type
and call fntype.
* trans-mem.c (is_tm_pure_call): Simplify.

* gcc.dg/tree-ssa/pr88029.c: New testcase.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 41d9f677c4f..23ccbae6bcb 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1446,15 +1446,17 @@ gimple_call_same_target_p (const gimple *c1, const 
gimple *c2)
 int
 gimple_call_flags (const gimple *stmt)
 {
-  int flags;
-  tree decl = gimple_call_fndecl (stmt);
+  int flags = 0;
 
-  if (decl)
-flags = flags_from_decl_or_type (decl);
-  else if (gimple_call_internal_p (stmt))
+  if (gimple_call_internal_p (stmt))
 flags = internal_fn_flags (gimple_call_internal_fn (stmt));
   else
-flags = flags_from_decl_or_type (gimple_call_fntype (stmt));
+{
+  tree decl = gimple_call_fndecl (stmt);
+  if (decl)
+   flags = flags_from_decl_or_type (decl);
+  flags |= flags_from_decl_or_type (gimple_call_fntype (stmt));
+}
 
   if (stmt->subcode & GF_CALL_NOTHROW)
 flags |= ECF_NOTHROW;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88029.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr88029.c
new file mode 100644
index 000..c169dd543cd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88029.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -w -fdump-tree-fre1-vops" } */
+
+double foo (double) __attribute__ ((pure));
+double (*fp) (double) __attribute__ ((const));
+double f(double a)
+{
+  fp = foo;
+  /* Verify when propagating foo to the call we preserve its constness.  */
+  return fp (a);
+}
+
+/* { dg-final { scan-tree-dump "foo \\(a" "fre1" } } */
+/* { dg-final { scan-tree-dump-times "VUSE" 1 "fre1" } } */
diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c
index 938f4e28b90..bb7146bc153 100644
--- a/gcc/trans-mem.c
+++ b/gcc/trans-mem.c
@@ -265,20 +265,7 @@ is_tm_safe (const_tree x)
 static bool
 is_tm_pure_call (gimple *call)
 {
-  if (gimple_call_internal_p (call))
-return (gimple_call_flags (call) & (ECF_CONST | ECF_TM_PURE)) != 0;
-
-  tree fn = gimple_call_fn (call);
-
-  if (TREE_CODE (fn) == ADDR_EXPR)
-{
-  fn = TREE_OPERAND (fn, 0);
-  gcc_assert (TREE_CODE (fn) == FUNCTION_DECL);
-}
-  else
-fn = TREE_TYPE (fn);
-
-  return is_tm_pure (fn);
+  return (gimple_call_flags (call) & (ECF_CONST | ECF_TM_PURE)) != 0;
 }
 
 /* Return true if X has been marked TM_CALLABLE.  */


[PATCH] Fix PR88030

2018-11-15 Thread Richard Biener


The following fixes PR88030.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-15  Richard Biener  

PR tree-optimization/88030
* tree-complex.c (need_eh_cleanup): New global.
(update_complex_assignment): Mark blocks that need EH update.
(expand_complex_comparison): Likewise.
(tree_lower_complex): Allocate and deallocate need_eh_cleanup,
perform EH cleanup and schedule CFG cleanup if that did anything.

* gcc.dg/tsan/pr88030.c: New testcase.

>From 1021924614c39cee9e4241d7f4318100a1590267 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Thu, 15 Nov 2018 09:57:49 +0100
Subject: [PATCH] fix-pr88030


diff --git a/gcc/testsuite/gcc.dg/tsan/pr88030.c 
b/gcc/testsuite/gcc.dg/tsan/pr88030.c
new file mode 100644
index 000..0c94c7c53f9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tsan/pr88030.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=thread -fnon-call-exceptions -fexceptions" } */
+
+typedef __complex__ float Value;
+typedef struct {
+  Value a[16 / sizeof (Value)];
+} A;
+
+A sum(A a,A b)
+{
+  a.a[0]+=b.a[0];
+  a.a[1]+=b.a[1];
+  return a;
+}
diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
index 4bf644f9473..2e104c9defb 100644
--- a/gcc/tree-complex.c
+++ b/gcc/tree-complex.c
@@ -80,6 +80,9 @@ static vec complex_ssa_name_components;
non-SSA_NAME/non-invariant args that need to be replaced by SSA_NAMEs.  */
 static vec phis_to_revisit;
 
+/* BBs that need EH cleanup.  */
+static bitmap need_eh_cleanup;
+
 /* Lookup UID in the complex_variable_components hashtable and return the
associated tree.  */
 static tree
@@ -701,7 +704,7 @@ update_complex_assignment (gimple_stmt_iterator *gsi, tree 
r, tree i)
   stmt = gsi_stmt (*gsi);
   update_stmt (stmt);
   if (maybe_clean_eh_stmt (stmt))
-gimple_purge_dead_eh_edges (gimple_bb (stmt));
+bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index);
 
   update_complex_components (gsi, gsi_stmt (*gsi), r, i);
 }
@@ -1559,7 +1562,7 @@ expand_complex_comparison (gimple_stmt_iterator *gsi, 
tree ar, tree ai,
 
   update_stmt (stmt);
   if (maybe_clean_eh_stmt (stmt))
-gimple_purge_dead_eh_edges (gimple_bb (stmt));
+bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index);
 }
 
 /* Expand inline asm that sets some complex SSA_NAMEs.  */
@@ -1773,6 +1776,8 @@ tree_lower_complex (void)
   class complex_propagate complex_propagate;
   complex_propagate.ssa_propagate ();
 
+  need_eh_cleanup = BITMAP_ALLOC (NULL);
+
   complex_variable_components = new int_tree_htab_type (10);
 
   complex_ssa_name_components.create (2 * num_ssa_names);
@@ -1818,11 +1823,15 @@ tree_lower_complex (void)
 
   gsi_commit_edge_inserts ();
 
+  unsigned todo
+= gimple_purge_all_dead_eh_edges (need_eh_cleanup) ? TODO_cleanup_cfg : 0;
+  BITMAP_FREE (need_eh_cleanup);
+
   delete complex_variable_components;
   complex_variable_components = NULL;
   complex_ssa_name_components.release ();
   complex_lattice_values.release ();
-  return 0;
+  return todo;
 }
 
 namespace {


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626

2018-11-15 Thread Richard Biener
On Thu, Nov 15, 2018 at 10:02 AM Umesh Kalappa  wrote:
>
> Hi All,
>
> The attached patch (pr85667.patch) fixes the subjected issue .
> we tested on x86_64(linux and windows both) and no regress found .
>
> ok to commit ?

I wonder if you can turn the testcase into a dg-run one, making the
functions noinline/noipa and check the correct values are returned.

Richard.

> Thank you
> ~Umesh


Re: [PATCH, csky] Update dynamic linker'name

2018-11-15 Thread Richard Biener
On Thu, Nov 15, 2018 at 7:02 AM 瞿仙淼  wrote:
>
> Hi,
> I have submitted a patch to update dynamic linker'name
>
>
> Index: gcc/ChangeLog
> ===
> --- gcc/ChangeLog   (revision 266171)
> +++ gcc/ChangeLog   (working copy)
> @@ -1,3 +1,9 @@
> +2018-11-15  Xianmiao Qu  
> +
> +   * config/csky/csky-linux-elf.h (LINUX_DYNAMIC_LINKER): Remove.
> +   (GLIBC_DYNAMIC_LINKER): Define.
> +   (LINUX_TARGET_LINK_SPEC): Update the dynamic linker's name.
> +
>  2018-11-15  Bin Cheng  
>
> PR tree-optimization/84648
> Index: gcc/config/csky/csky-linux-elf.h
> ===
> --- gcc/config/csky/csky-linux-elf.h(revision 266171)
> +++ gcc/config/csky/csky-linux-elf.h(working copy)
> @@ -61,7 +61,7 @@
>%{mvdsp:-mvdsp}  \
>"
>
> -#define LINUX_DYNAMIC_LINKER  "/lib/ld.so.1"
> +#define GLIBC_DYNAMIC_LINKER 
> "/lib/ld-linux-cskyv2%{mhard-float:-hf}%{mbig-endian:-be}.so.1"
>
>  #define LINUX_TARGET_LINK_SPEC "%{h*} %{version:-v}\
> %{b}\
> @@ -70,7 +70,7 @@
> %{symbolic:-Bsymbolic}  \
> %{!static:  \
>   %{rdynamic:-export-dynamic}   \
> - %{!shared:-dynamic-linker " LINUX_DYNAMIC_LINKER "}}  \
> + %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}   \

^^^ GNU_USER_DYNAMIC_LINKER vs. GLIBC_DYNAMIC_LINKER

> -X  \
> %{mbig-endian:-EB} %{mlittle-endian:-EL}\
> %{EB:-EB} %{EL:-EL}"
> Index: libgcc/ChangeLog
> ===
> --- libgcc/ChangeLog(revision 266171)
> +++ libgcc/ChangeLog(working copy)
> @@ -1,3 +1,7 @@
> +2018-11-15  Xianmiao Qu  
> +
> +   * config/csky/linux-unwind.h: Fix coding style.
> +
>  2018-11-13  Xianmiao Qu  
>
> * config/csky/linux-unwind.h (_sig_ucontext_t): Remove.
> Index: libgcc/config/csky/linux-unwind.h
> ===
> --- libgcc/config/csky/linux-unwind.h   (revision 266171)
> +++ libgcc/config/csky/linux-unwind.h   (working copy)
> @@ -25,10 +25,8 @@
>
>  #ifndef inhibit_libc
>
> -/*
> - * Do code reading to identify a signal frame, and set the frame state data
> - * appropriately.  See unwind-dw2.c for the structs.
> - */
> +/* Do code reading to identify a signal frame, and set the frame state data
> +   appropriately.  See unwind-dw2.c for the structs.  */
>
>  #include 
>  #include 
>
>


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85667

2018-11-15 Thread Umesh Kalappa
Edited the subjected for the proper PR no.
~Umesh

On Thu, Nov 15, 2018 at 2:32 PM Umesh Kalappa  wrote:
>
> Hi All,
>
> The attached patch (pr85667.patch) fixes the subjected issue .
> we tested on x86_64(linux and windows both) and no regress found .
>
> ok to commit ?
>
> Thank you
> ~Umesh


Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)

2018-11-15 Thread Christophe Lyon
On Thu, 15 Nov 2018 at 10:39, Matthew Malcomson
 wrote:
>
> On 02/11/18 09:54, Christophe Lyon wrote:
> > Hi,
> >
> > I've noticed failure on targets using newlib (aarch64-elf and arm-eabi):
> > FAIL: gcc.c-torture/execute/printf-2.c
> > FAIL: gcc.c-torture/execute/user-printf.c
> >
> > my gcc.log contains:
> > gcc.c-torture/execute/user-printf.c   -O0  execution test (reason: TCL
> > LOOKUP CHANNEL exp5)
> > which is not very helpful
> >
>
> @Christophe
> Can I ask if your DejaGNU board setup has "needs_status_wrapper 1" for
> both of these boards?
>
Yes, I do use this.

> I believe this problem is caused by an interaction with the DejaGNU
> status wrapper.
>
> When the status wrapper is needed, DejaGNU looks at stdout for a line
> saying "*** EXIT code " indicating what the status code was.
> When it doesn't find that line it assumes an exit code of 2.
> Without the status wrapper DejaGNU takes the return code from the
> program executed.
>
> Because these tests use "freopen ()" on stdout, the status wrapper fails
> to print to the IO channel DejaGNU is listening on, hence DejaGNU fails
> to find it's line, uses an exit code of 2, and fails the test.
>
>
> @Martin
> Would these tests still be valid having run freopen on stderr and using
> fprintf instead of printf?
> That makes the testcases pass for me.
>
> If not we could add an
> { dg-require-effective-target unwrapped }
> directive in the testcases to stop the failure complaints.


Re: [PATCH] diagnose built-in declarations without prototype (PR 83656)

2018-11-15 Thread Richard Biener
On Wed, Nov 14, 2018 at 7:08 PM Jeff Law  wrote:
>
> On 11/6/18 4:56 PM, Martin Sebor wrote:
> > In response to Joseph's comment I've removed the interaction
> > with -Wpedantic from the updated patch.
> >
> > In addition, to help detect bugs like the one in the test case
> > for pr87886, I have also enhanced the detection of incompatible
> > calls to include integer/real type mismatches so that calls like
> > the one below are diagnosed:
> >
> >   extern double sqrt ();
> >   int f (int x)
> >   {
> > return sqrt (x);   // passing int where double is expected
> >   }
> >
> > With the removal of the -Wpedantic interaction declaring abort()
> > without a prototype is no longer diagnosed and so the test suite
> > changes to add the prototype are not necessary.  I decided not
> > to back them out because Jeff indicated a preference for making
> > these kinds of improvements in general in an unrelated
> > discussion.
> >
> >
> >
> > gcc-83656.diff
> >
> > PR c/83656 - missing -Wbuiltin-declaration-mismatch on declaration without 
> > prototype
> >
> > gcc/c/ChangeLog:
> >
> >   PR c/83656
> >   * c-decl.c (header_for_builtin_fn): Declare.
> >   (diagnose_mismatched_decls): Diagnose declarations of built-in
> >   functions without a prototype.
> >   * c-typeck.c (maybe_warn_builtin_no_proto_arg): New function.
> >   (convert_argument): Same.
> >   (convert_arguments): Factor code out into convert_argument.
> >   Detect mismatches between built-in formal arguments in calls
> >   to built-in without prototype.
> >   (build_conditional_expr): Same.
> >   (type_or_builtin_type): New function.
> >   (convert_for_assignment): Add argument.  Conditionally issue
> >   warnings instead of errors for mismatches.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR c/83656
> >   * gcc.dg/20021006-1.c
> >   * gcc.dg/Wbuiltin-declaration-mismatch.c: New test.
> >   * gcc.dg/Wbuiltin-declaration-mismatch-2.c: New test.
> >   * gcc.dg/Wbuiltin-declaration-mismatch-3.c: New test.
> >   * gcc.dg/Wbuiltin-declaration-mismatch-4.c: New test.
> >   * gcc.dg/Walloca-16.c: Adjust.
> >   * gcc.dg/Wrestrict-4.c: Adjust.
> >   * gcc.dg/Wrestrict-5.c: Adjust.
> >   * gcc.dg/atomic/stdatomic-generic.c: Adjust.
> >   * gcc.dg/atomic/stdatomic-lockfree.c: Adjust.
> >   * gcc.dg/initpri1.c: Adjust.
> >   * gcc.dg/pr15698-1.c: Adjust.
> >   * gcc.dg/pr69156.c: Adjust.
> >   * gcc.dg/pr83463.c: Adjust.
> >   * gcc.dg/redecl-4.c: Adjust.
> >   * gcc.dg/tls/thr-init-2.c: Adjust.
> >   * gcc.dg/torture/pr55890-2.c: Adjust.
> >   * gcc.dg/torture/pr55890-3.c: Adjust.
> >   * gcc.dg/torture/pr67741.c: Adjust.
> >   * gcc.dg/torture/stackalign/sibcall-1.c: Adjust.
> >   * gcc.dg/torture/tls/thr-init-1.c: Adjust.
> >   * gcc.dg/tree-ssa/builtins-folding-gimple-ub.c: Adjust.
> >
>
>
> > @@ -3547,8 +3598,24 @@ convert_arguments (location_t loc, vec 
> > arg_loc, tree typelist,
> >if (parmval == error_mark_node)
> >   error_args = true;
> >
> > +  if (!type && builtin_type && TREE_CODE (builtin_type) != VOID_TYPE)
> > + {
> > +   /* For a call to a built-in function declared without a prototype,
> > +  perform the coversions from the argument to the expected type
> > +  but issue warnings rather than errors for any mismatches.
> > +  Ignore the converted argument and use the PARMVAL obtained
> > +  above by applying default coversions instead.  */
> s/coversions/conversions/
>
> Two of 'em in that comment.  OK with that nit fixed.

I once had -Wunprototyped-calls...  (attached).  Because the issue doesn't
exist only for builtins... (originally inspired by a Xorg bug)

Richard.

> jeff
>
>
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt.orig	2014-11-11 09:38:31.286867405 +0100
+++ gcc/c-family/c.opt	2014-11-17 16:38:44.438078729 +0100
@@ -868,6 +868,10 @@ Wunused-local-typedefs
 C ObjC C++ ObjC++ Var(warn_unused_local_typedefs) Warning EnabledBy(Wunused)
 Warn when typedefs locally defined in a function are not used
 
+Wunprototyped-calls
+C Var(warn_unprototyped_calls) Warning LangEnabledBy(C,Wall)
+Warn about calls to unprototyped functions with at least one argument
+
 Wunused-macros
 C ObjC C++ ObjC++ CppReason(CPP_W_UNUSED_MACROS) Var(cpp_warn_unused_macros) Warning
 Warn about macros defined in the main file that are not used
Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c.orig	2014-11-11 09:38:32.068867371 +0100
+++ gcc/c/c-typeck.c	2014-11-17 16:38:44.442078729 +0100
@@ -2957,6 +2957,19 @@ build_function_call_vec (location_t loc,
   && !check_builtin_function_arguments (fundecl, nargs, argarray))
 return error_mark_node;
 
+  /* If we cannot check function arguments because a prototype is
+ missing for the 

Re: Don't use %z printf format length specified

2018-11-15 Thread Richard Biener
On Wed, Nov 14, 2018 at 6:24 PM Michael Matz  wrote:
>
> Hi,
>
> On Wed, 14 Nov 2018, Alexander Monakov wrote:
>
> > On Wed, 14 Nov 2018, Michael Matz wrote:
> >
> > > Hi,
> > >
> > > it's not c++98 conforming and I get 1 million warnings when compiling.
> > > Initially I had also casts to long at the various arguments, but I get no
> > > warning with this variant either, so I removed them again.  I'm currently
> > > regstrapping this, okay for trunk if successfull?
> >
> > Surely this will break mingw-w64 where size_t is 64-bit yet long is 32-bit?
>
> Okay, probably.  Then consider the same patch with sprinkling casts to
> long for all these arguments (the actual numbers will fit that type in
> reality).  I could of course also use PRIu64 but that makes my eyes bleed.

Hmm.  Can you use PRIu64 please and cast to uint64_t?  OK with that
change.

Thanks,
Richard.

>
> Ciao,
> Michael.


Re: [PATCH][rs6000] inline expansion of memcmp using vsx

2018-11-15 Thread Richard Biener
On Wed, Nov 14, 2018 at 5:43 PM Aaron Sawdey  wrote:
>
> This patch generalizes some the functions added earlier to do vsx expansion 
> of strncmp
> so that the can also generate the code needed for memcmp. I reorganized
> expand_block_compare() a little to be able to make use of this there. The vsx 
> code is more
> compact so I've changed the default block compare inline limit to 63 bytes. 
> The vsx
> code is only used if there is at least 16 bytes to compare as this means we 
> don't have to
> do complex code to compare less than one chunk. If vsx is not available the 
> limit is cut
> in half. The performance is good, vsx memcmp is considerably faster than the 
> gpr inline code
> if the strings are equal and is comparable if the strings have a 10% chance 
> of being
> equal (spread across the string).

How is performance affected if there are close earlier char-size
stores to one of the string/memory?
Can power still do store forwarding in this case?

> Currently regtesting, ok for trunk if tests pass?
>
> Thanks!
>Aaron
>
> 2018-11-14  Aaron Sawdey  
>
> * config/rs6000/rs6000-string.c (emit_vsx_zero_reg): New function.
> (expand_cmp_vec_sequence): Rename and modify
> expand_strncmp_vec_sequence.
> (emit_final_compare_vec): Rename and modify 
> emit_final_str_compare_vec.
> (generate_6432_conversion): New function.
> (expand_block_compare): Add support for vsx.
> (expand_block_compare_gpr): New function.
> * config/rs6000/rs6000.opt (rs6000_block_compare_inline_limit): 
> Increase
> default limit to 63 because of more compact vsx code.
>
>
>
>
> Index: gcc/config/rs6000/rs6000-string.c
> ===
> --- gcc/config/rs6000/rs6000-string.c   (revision 266034)
> +++ gcc/config/rs6000/rs6000-string.c   (working copy)
> @@ -615,6 +615,283 @@
>  }
>  }
>
> +static rtx
> +emit_vsx_zero_reg()
> +{
> +  unsigned int i;
> +  rtx zr[16];
> +  for (i = 0; i < 16; i++)
> +zr[i] = GEN_INT (0);
> +  rtvec zv = gen_rtvec_v (16, zr);
> +  rtx zero_reg = gen_reg_rtx (V16QImode);
> +  rs6000_expand_vector_init (zero_reg, gen_rtx_PARALLEL (V16QImode, zv));
> +  return zero_reg;
> +}
> +
> +/* Generate the sequence of compares for strcmp/strncmp using vec/vsx
> +   instructions.
> +
> +   BYTES_TO_COMPARE is the number of bytes to be compared.
> +   ORIG_SRC1 is the unmodified rtx for the first string.
> +   ORIG_SRC2 is the unmodified rtx for the second string.
> +   S1ADDR is the register to use for the base address of the first string.
> +   S2ADDR is the register to use for the base address of the second string.
> +   OFF_REG is the register to use for the string offset for loads.
> +   S1DATA is the register for loading the first string.
> +   S2DATA is the register for loading the second string.
> +   VEC_RESULT is the rtx for the vector result indicating the byte 
> difference.
> +   EQUALITY_COMPARE_REST is a flag to indicate we need to make a cleanup call
> +   to strcmp/strncmp if we have equality at the end of the inline comparison.
> +   P_CLEANUP_LABEL is a pointer to rtx for a label we generate if we need 
> code
> +   to clean up and generate the final comparison result.
> +   FINAL_MOVE_LABEL is rtx for a label we can branch to when we can just
> +   set the final result.
> +   CHECKZERO indicates whether the sequence should check for zero bytes
> +   for use doing strncmp, or not (for use doing memcmp).  */
> +static void
> +expand_cmp_vec_sequence (unsigned HOST_WIDE_INT bytes_to_compare,
> +rtx orig_src1, rtx orig_src2,
> +rtx s1addr, rtx s2addr, rtx off_reg,
> +rtx s1data, rtx s2data, rtx vec_result,
> +bool equality_compare_rest, rtx *p_cleanup_label,
> +rtx final_move_label, bool checkzero)
> +{
> +  machine_mode load_mode;
> +  unsigned int load_mode_size;
> +  unsigned HOST_WIDE_INT cmp_bytes = 0;
> +  unsigned HOST_WIDE_INT offset = 0;
> +  rtx zero_reg = NULL;
> +
> +  gcc_assert (p_cleanup_label != NULL);
> +  rtx cleanup_label = *p_cleanup_label;
> +
> +  emit_move_insn (s1addr, force_reg (Pmode, XEXP (orig_src1, 0)));
> +  emit_move_insn (s2addr, force_reg (Pmode, XEXP (orig_src2, 0)));
> +
> +  if (checkzero && !TARGET_P9_VECTOR)
> +zero_reg = emit_vsx_zero_reg();
> +
> +  while (bytes_to_compare > 0)
> +{
> +  /* VEC/VSX compare sequence for P8:
> +check each 16B with:
> +lxvd2x 32,28,8
> +lxvd2x 33,29,8
> +vcmpequb 2,0,1  # compare strings
> +vcmpequb 4,0,3  # compare w/ 0
> +xxlorc 37,36,34   # first FF byte is either mismatch or end of 
> string
> +vcmpequb. 7,5,3  # reg 7 contains 0
> +bnl 6,.Lmismatch
> +
> +For the P8 LE case, we use lxvd2x and compare full 16 bytes
> +but then use use vgbbd and a shift to get two bytes with the
> + 

Re: [PATCH 21/25] GCN Back-end (part 2/2).

2018-11-15 Thread Andrew Stubbs

On 14/11/2018 22:30, Jeff Law wrote:

There's a particular case that has historically been problematical.

If you have this kind of sequence in the epilogue

restore register using FP
move fp->sp  (deallocates frame)
return

Under certain circumstances the scheduler can swap the register restore
and move from fp into sp creating something like this:

move fp->sp (deallocates frame)
restore register using FP (reads from deallocated frame)
return

That would normally be OK, except if you take an interrupt between the
first two instructions.  If interrupt handling is done without switching
stacks, then the interrupt handler may write into the just de-allocated
frame destroying the values that were saved in the prologue.


OK, so the barrier needs to be right before the stack pointer moves. I 
can do that. :-)


Presumably the same is true for prologues, except that the barrier needs 
to be after the stack adjustment.



You may not need to worry about that today on the GCN port, but you
really want to fix it now so that it's never a problem.  You *really*
don't want to have to debug this kind of problem in the wild.  Been
there, done that, more than once :(


I'm not exactly sure how interrupts work on this platform -- we've had 
no use for them yet -- but without a debugger, and with up to 1024 
threads running simultaneously, you can be sure I don't want to debug it!



I would hazard a guess that combine saw the one without the use as
"simpler" and preferred it.  I think you've made a bit of a fundamental
problem with the way the EXEC register is being handled.  Hopefully you
can get by with some magic UNSPEC wrappers without having to do too much
surgery.


Exactly so. An initial experiment with combine re-enabled has not shown 
any errors, so it's possible the problem has gone away, but I've not 
been over the full testsuite yet (and you wouldn't expect actual 
failures anyway).



In future, I'd like to have the scheduler insert real instructions into
these slots, but that's very much on the to-do list.

If you you can model this as a latency between the two points where you
need to insert the nops, then the scheduler will fill in what it can.
But it doesn't generally handle non-interlocked processors.   So you'll
still want your little pass to fix things up when the scheduler couldn't
find useful work to schedule into those bubbles.


Absolutely, the scheduler is about optimization and this md_reorg pass 
is about correctness.



I have no idea whether the architecture has those issues or not.

The guideline I would give to determine if you're vulnerable...  Do you
have speculation, including the ability to speculate past a memory
operation, branch prediction, memory caches and high resolution timer
(ie, like a cycle timer).  If you've got those, then the processor is
likely vulnerable to a spectre V1 style attack.  Those are the basic
building blocks.


We have cycle timers and caches, but I'll have to ask AMD about the 
other details.


Andrew


Re: [PATCH] avoid -Wnonnull for printf format in dead code (PR 87041)

2018-11-15 Thread Matthew Malcomson
On 02/11/18 09:54, Christophe Lyon wrote:
> Hi,
>
> I've noticed failure on targets using newlib (aarch64-elf and arm-eabi):
> FAIL: gcc.c-torture/execute/printf-2.c
> FAIL: gcc.c-torture/execute/user-printf.c
>
> my gcc.log contains:
> gcc.c-torture/execute/user-printf.c   -O0  execution test (reason: TCL
> LOOKUP CHANNEL exp5)
> which is not very helpful
>

@Christophe
Can I ask if your DejaGNU board setup has "needs_status_wrapper 1" for 
both of these boards?

I believe this problem is caused by an interaction with the DejaGNU 
status wrapper.

When the status wrapper is needed, DejaGNU looks at stdout for a line 
saying "*** EXIT code " indicating what the status code was.
When it doesn't find that line it assumes an exit code of 2.
Without the status wrapper DejaGNU takes the return code from the 
program executed.

Because these tests use "freopen ()" on stdout, the status wrapper fails 
to print to the IO channel DejaGNU is listening on, hence DejaGNU fails 
to find it's line, uses an exit code of 2, and fails the test.


@Martin
Would these tests still be valid having run freopen on stderr and using 
fprintf instead of printf?
That makes the testcases pass for me.

If not we could add an
    { dg-require-effective-target unwrapped }
directive in the testcases to stop the failure complaints.


Re: [PATCH 1/7][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-15 Thread Andreas Schwab
On Nov 14 2018, Jozef Lawrynowicz  wrote:

> The timeout as set in the dejagnu configuration for msp430
> ([dejagnu.git]/baseboards/msp430-sim.exp) is 30, which is rarely
> hit.

I don't think it makes sense for a board file to set a smaller timeout
than the default.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626

2018-11-15 Thread Umesh Kalappa
Hi All,

The attached patch (pr85667.patch) fixes the subjected issue .
we tested on x86_64(linux and windows both) and no regress found .

ok to commit ?

Thank you
~Umesh


pr85667.patch
Description: Binary data


Re: Bug 52869 - [DR 1207] "this" not being allowed in noexcept clauses

2018-11-15 Thread Umesh Kalappa
Thank you Marek  for the inputs .
>>In the future, if using diff, please also use the -p option.
We are using svn diif  and other comments are addressed .

please let us know  your take on the  revised attached patch .

Thank you
~Umesh
On Thu, Nov 15, 2018 at 12:23 AM Marek Polacek  wrote:
>
> On Wed, Nov 14, 2018 at 09:55:39PM +0530, Umesh Kalappa wrote:
> > My bad Marek and thank you for pointing that out.
> >
> > Please find the attached correct one (pr52869.patch) .
>
> Index: gcc/cp/ChangeLog
> ===
> --- gcc/cp/ChangeLog(revision 266026)
> +++ gcc/cp/ChangeLog(working copy)
> @@ -1,3 +1,9 @@
> +2018-11-14  Kamlesh Kumar  
> +
> +   PR c++/52869
> +   *parser.c () :  restore the old current_class_{ptr,ref} by
> +   inject_this_parameter().
> +
>
> So the correct CL entry would look like
>
> 2018-11-14  Kamlesh Kumar  
>
> DR 1207
> PR c++/52869
> * parser.c (cp_parser_noexcept_specification_opt): Call
> inject_this_parameter.
>
> or so.
>
> Index: gcc/cp/parser.c
> ===
> --- gcc/cp/parser.c (revision 266026)
> +++ gcc/cp/parser.c (working copy)
> @@ -24615,11 +24615,24 @@
>  {
>tree expr;
>cp_lexer_consume_token (parser->lexer);
> -
> +
>
> You're adding whitespaces where they shouldn't be.  Let's avoid changes like 
> these.
>
>if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN)
> {
>   matching_parens parens;
>   parens.consume_open (parser);
> +
> + if (current_class_type)
> +  inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
> +  else
> +{
> +  /*clear the current_class_ptr for non class type , like
> +   int foo() noexcept(*this)
> +   {
> + return 1;
> +   }
> + */
> +current_class_ptr = NULL_TREE;
> +}
>
> I don't believe that's what Jason meant by restoring; I think you want
>
>   tree save_ccp = current_class_ptr;
>   tree save_ccr = current_class_ref;
>
>   inject_this_parameter (current_class_type, TYPE_UNQUALIFIED);
>
>   [...]
>
>   current_class_ptr = save_ccp;
>   current_class_ref = save_ccr;
>
> In the future, if using diff, please also use the -p option.
>
> Index: gcc/testsuite/ChangeLog
> ===
> --- gcc/testsuite/ChangeLog (revision 266026)
> +++ gcc/testsuite/ChangeLog (working copy)
> @@ -1,3 +1,8 @@
> +2018-11-14  Kamlesh Kumar  
> +
> +   PR g++.dg/52869
> +   * g++.dg/pr52869.C: New.
>
> Should be "PR c++/52869".
>
> Index: gcc/testsuite/g++.dg/pr52869.C
> ===
> --- gcc/testsuite/g++.dg/pr52869.C  (nonexistent)
> +++ gcc/testsuite/g++.dg/pr52869.C  (working copy)
>
> Maybe move the test to testsuite/g++.dg/DRs?
>
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0 -g" } */
>
> Why these options?  I don't think you need -g.
>
> +struct S {
> +void f() { }
> +void g() noexcept(noexcept(f())) { }
> +void h() noexcept(noexcept(this->f())) { }
> +};
> +
> +struct Nyan {
> +   constexpr Nyan ++() noexcept { return *this; }
> +   constexpr void omg() noexcept(noexcept(++*this)) {}
> +};
>
> I was hoping you'd add also a test with 'this' in noexcept in a class 
> template.
>
> This test doesn't compile on all dialects:
> FAIL: g++.dg/pr52869.C  -std=gnu++98 (test for excess errors)
> FAIL: g++.dg/pr52869.C  -std=gnu++11 (test for excess errors)
>
> You can run just the new test in all dialects using:
> GXX_TESTSUITE_STDS=98,11,14,17,2a make check-c++ RUNTESTFLAGS=dg.exp=pr52869.C
>
> The noexcept specifier is only in C++11 and newer I think.
>
> +template< typename T >
> +T sine( T const& a, T const& b ) noexcept
> +{
> +static_assert( noexcept( T(a / sqrt(a * a  + b * b)) ), "throwing expr" 
> );
> +return a / sqrt(a * a  + b * b);
> +}
> +
> +int foo() noexcept
> +{
> +  return 1;
> +}
> +
>
> I don't understand what this part of the test is testing.  It compiles even
> without the patch.  Let's drop it.
>
> Marek


pr52869.patch
Description: Binary data


Re: [PATCH][RFC] Come up with -flive-patching master option.

2018-11-15 Thread Martin Liška
On 11/14/18 6:54 PM, Qing Zhao wrote:
> Hi, 
> 
> 
>> On Nov 14, 2018, at 9:03 AM, Martin Liška  wrote:
>>

>>> Yes, you are right. I added this into my patch.
>>>
>>> I am attaching the new patch here.
>>
>> Hello.
>>
>> Please use
>> git diff HEAD~ > /tmp/patch && ~/Programming/gcc/contrib/check_GNU_style.py 
>> /tmp/patch
>>
>> in order to address many formatting issues of the patch (skip the ones 
>> reported in common.opt).
> 
> will do and fix the style issues.
> 
>>
>>>
>>>
>>> +flive-patching
>>> +Common RejectNegative Alias(flive-patching=,inline-clone) Optimization
>>> +
>>> +flive-patching=
>>> +Common Report Joined RejectNegative Enum(live_patching_level) 
>>> Var(flag_live_patching) Init(LIVE_NONE) Optimization
>>> +-flive-patching=[inline-only-static|inline-clone]  Control ipa 
>>> optimizations to provide a 
>>
>> Please use 'IPA' instead of 'ipa', similarly in documentation.
> Okay.
>>
>>> --- a/gcc/doc/invoke.texi
>>> +++ b/gcc/doc/invoke.texi
>>> @@ -411,10 +411,11 @@ Objective-C and Objective-C++ Dialects}.
>>> -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
>>> -fif-conversion2  -findirect-inlining @gol
>>> -finline-functions  -finline-functions-called-once  -finline-limit=@var{n} 
>>> @gol
>>> --finline-small-functions  -fipa-cp  -fipa-cp-clone @gol
>>> +-finline-small-functions -fipa-cp  -fipa-cp-clone @gol
>>
>> This changes is probably not intended.
> No. will delete it.
> 
>>
>>>
>>> diff --git a/gcc/flag-types.h b/gcc/flag-types.h
>>> index 500f663..72e0f0f 100644
>>> --- a/gcc/flag-types.h
>>> +++ b/gcc/flag-types.h
>>> @@ -123,6 +123,14 @@ enum stack_reuse_level
>>>   SR_ALL
>>> };
>>>
>>> +/* The live patching level. */
>>> +enum live_patching_level
>>> +{
>>> +  LIVE_NONE = 0,
>>> +  LIVE_INLINE_ONLY_STATIC,
>>> +  LIVE_INLINE_CLONE
>>
>> Maybe better LIVE_PATCHING_INLINE_CLONE, without the 'PATCHING' the enum
>> values are bit unclear.
> 
> Okay.
>>
>>>
>>> +  /* visibility change should be excluded by !flag_whole_program 
>>> +&& !in_lto_p && !flag_ipa_cp_clone && !flag_ipa_sra 
>>
>> You added sorry about LTO, maybe then !in_lto_p would be always true?
> 
> Yes, since live-patching does not support LTO currently,  !in_lto_p is always 
> TRUE. that’s the reason no need for a new flag to disable visibility change. 

Hi.

Ok.

>>
>>> +&& !flag_partial_inlining.  */
>>> +  if (!opts_set->x_flag_ipa_pta)
>>> +opts->x_flag_ipa_pta = 0;
>>> +  if (!opts_set->x_flag_ipa_reference)
>>> +opts->x_flag_ipa_reference = 0;
>>> +  if (!opts_set->x_flag_ipa_ra)
>>> +opts->x_flag_ipa_ra = 0;
>>> +  if (!opts_set->x_flag_ipa_icf)
>>> +opts->x_flag_ipa_icf = 0;
>>> +  if (!opts_set->x_flag_ipa_icf_functions)
>>> +opts->x_flag_ipa_icf_functions = 0;
>>> +  if (!opts_set->x_flag_ipa_icf_variables)
>>> +opts->x_flag_ipa_icf_variables = 0;
>>> +  if (!opts_set->x_flag_ipa_bit_cp)
>>> +opts->x_flag_ipa_bit_cp = 0;
>>> +  if (!opts_set->x_flag_ipa_vrp)
>>> +opts->x_flag_ipa_vrp = 0;
>>> +  if (!opts_set->x_flag_ipa_pure_const)
>>
>> Can you please explain why you included:
>>  if (!opts_set->x_flag_ipa_bit_cp)
>>opts->x_flag_ipa_bit_cp = 0;
>>  if (!opts_set->x_flag_ipa_vrp)
>>opts->x_flag_ipa_vrp = 0;
> 
> interprocedural bitwise constant propagation and interprocedural propagation 
> of value ranges does not involve creating clones,
> and the bitwise constant and value ranges info extracted during ipa-cp phase 
> are used later by other optimizations. their effect on 
> impact functions are not clear at this moment. that’s the reason I think we 
> need to disable these two. 
> 
> Martin Jambor raised this issue during our previous discussion on 10/03/2018:
> “
> I was thinking a bit more about this and recalled that not all stuff
> that IPA-CP nowadays does involves creating clones, so we have to add
> also:
>  - -fno-ipa-bit-cp, and
>  - -fno-ipa-vrp.
> 
> These two just record info in the parameters of *callees* of functions
> from which it extracted info, without any cloning involved.  Both were
> introduced in GCC 7.
> 
> Thanks,
> 
> Martin
> “
> and I think he is right.

Great, thanks for clarification! I forgot about that.

And please can you mention in documentation which options are disabled with 
-flive-patching=*?
We usually do it, e.g. take a look at '-Os' option:

```
‘-Os’ disables the following optimization flags:
-falign-functions -falign-jumps -falign-loops
-falign-labels -freorder-blocks -freorder-blocks-algorithm=stc
-freorder-blocks-and-partition -fprefetch-loop-arrays
```

> 
> 
> 
>> ?
>>
>>> +opts->x_flag_ipa_pure_const = 0;
>>> +  /* unreachable code removal.  */
>>> +  /* discovery of functions/variables with no address taken.  */
>>
>> ^^^ these 2 comments looks misaligned.
> 
> will fix them. 
>>
>>>
>>> +++ b/gcc/testsuite/gcc.dg/live-patching-1.c
>>> @@ -0,0 +1,22 @@
>>> +/* { dg-do compile } */
>>> +/* { 

Re: [PATCH] Fix ICE in fixup_abnormal_edges (PR rtl-optimization/88018)

2018-11-15 Thread Richard Biener
On Thu, 15 Nov 2018, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase, we have a call (not marked noreturn), which can
> throw, followed immediately by __builtin_unreachable (), so it effectively
> is noreturn in this particular call site (i.e. if it returns, it is UB).
> In RTL this is represented by the corresponding bb having just EH edge
> successor, no fallthrough edge.  If one of the callers of
> fixup_abnormal_edges (the stack pass in this case, or LRA or reload) adds
> some instructions after the call and calls fixup_abnormal_edges, it ICEs,
> as it tries to delete those insns and insert them on the NULL edge.
> Somebody has hit that issue already but solved it for USE insns only, this
> patch does that for all the cases where we don't have the fallthru edge.
> If there weren't any instructions after the call before one of these 3
> passes, I'd say it must be UB to return from such a call and thus it should
> be unimportant if those insns are executed or not.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2018-11-15  Jakub Jelinek  
> 
>   PR rtl-optimization/88018
>   * cfgrtl.c (fixup_abnormal_edges): Guard moving insns to fallthru edge
>   on the presence of fallthru edge, rather than if it is a USE or not.
> 
>   * g++.dg/tsan/pr88018.C: New test.
> 
> --- gcc/cfgrtl.c.jj   2018-11-14 00:55:51.695333876 +0100
> +++ gcc/cfgrtl.c  2018-11-14 17:24:46.414915101 +0100
> @@ -3332,8 +3332,15 @@ fixup_abnormal_edges (void)
>If it's placed after a trapping call (i.e. that
>call is the last insn anyway), we have no fallthru
>edge.  Simply delete this use and don't try to insert
> -  on the non-existent edge.  */
> -   if (GET_CODE (PATTERN (insn)) != USE)
> +  on the non-existent edge.
> +  Similarly, sometimes a call that can throw is
> +  followed in the source with __builtin_unreachable (),
> +  meaning that there is UB if the call returns rather
> +  than throws.  If there weren't any instructions
> +  following such calls before, supposedly even the ones
> +  we've deleted aren't significant and can be
> +  removed.  */
> +   if (e)
>   {
> /* We're not deleting it, we're moving it.  */
> insn->set_undeleted ();
> --- gcc/testsuite/g++.dg/tsan/pr88018.C.jj2018-11-14 17:26:46.224944969 
> +0100
> +++ gcc/testsuite/g++.dg/tsan/pr88018.C   2018-11-14 17:28:40.057073142 
> +0100
> @@ -0,0 +1,6 @@
> +// PR rtl-optimization/88018
> +// { dg-do compile }
> +// { dg-skip-if "" { *-*-* }  { "*" } { "-O0" } }
> +// { dg-options "-fsanitize=thread -fno-ipa-pure-const -O1 
> -fno-inline-functions-called-once -w" }
> +
> +#include "../pr69667.C"
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT

2018-11-15 Thread Richard Biener
On Thu, 15 Nov 2018, Richard Biener wrote:

> On Wed, 14 Nov 2018, Andrew Pinski wrote:
> 
> > On Fri, May 13, 2016 at 3:51 AM Richard Biener  wrote:
> > >
> > >
> > > The following patch adds BIT_FIELD_INSERT, an operation to
> > > facilitate doing bitfield inserts on registers (as opposed
> > > to currently where we'd have a BIT_FIELD_REF store).
> > >
> > > Originally this was developed as part of bitfield lowering
> > > where bitfield stores were lowered into read-modify-write
> > > cycles and the modify part, instead of doing shifting and masking,
> > > be kept in a more high-level form to ease combining them.
> > >
> > > A second use case (the above is still valid) is vector element
> > > inserts which we currently can only do via memory or
> > > by extracting all components and re-building the vector using
> > > a CONSTRUCTOR.  For this second use case I added code
> > > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > > re-write a decl into SSA form (the testcase shows we miss
> > > a similar opportunity with the MEM_REF form of a vector insert,
> > > I plan to fix that for the final submission).
> > >
> > > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > > is that the size of the insertion is given implicitely via the
> > > type size/precision of the value to insert.  That avoids
> > > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > >
> > > Richard.
> > >
> > > 2011-06-16  Richard Guenther  
> > >
> > > PR tree-optimization/29756
> > > * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > > * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > > * fold-const.c (operand_equal_p): Likewise.
> > > (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > > * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > > * tree-inline.c (estimate_operator_cost): Likewise.
> > > * tree-pretty-print.c (dump_generic_node): Likewise.
> > > * tree-ssa-operands.c (get_expr_operands): Likewise.
> > > * cfgexpand.c (expand_debug_expr): Likewise.
> > > * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > > * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > > * tree-cfg.c (verify_gimple_assign_ternary): Verify 
> > > BIT_FIELD_INSERT.
> > >
> > > * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > > vector inserts using BIT_FIELD_REF on the lhs.
> > > (execute_update_addresses_taken): Do it.
> > >
> > > * gcc.dg/tree-ssa/vector-6.c: New testcase.
> > >
> > > Index: trunk/gcc/expr.c
> > > ===
> > > *** trunk.orig/gcc/expr.c   2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/expr.c2016-05-12 15:40:32.481225744 +0200
> > > *** expand_expr_real_2 (sepops ops, rtx targ
> > > *** 9358,9363 
> > > --- 9358,9380 
> > > target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, 
> > > target);
> > > return target;
> > >
> > > + case BIT_FIELD_INSERT:
> > > +   {
> > > +   unsigned bitpos = tree_to_uhwi (treeop2);
> > > +   unsigned bitsize;
> > > +   if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > > + bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > > +   else
> > > + bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > > +   rtx op0 = expand_normal (treeop0);
> > > +   rtx op1 = expand_normal (treeop1);
> > > +   rtx dst = gen_reg_rtx (mode);
> > > +   emit_move_insn (dst, op0);
> > > +   store_bit_field (dst, bitsize, bitpos, 0, 0,
> > > +TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > > +   return dst;
> > > +   }
> > > +
> > >   default:
> > > gcc_unreachable ();
> > >   }
> > > Index: trunk/gcc/fold-const.c
> > > ===
> > > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > > --- trunk/gcc/fold-const.c  2016-05-13 09:41:13.509812127 +0200
> > > *** operand_equal_p (const_tree arg0, const_
> > > *** 3163,3168 
> > > --- 3163,3169 
> > >
> > > case VEC_COND_EXPR:
> > > case DOT_PROD_EXPR:
> > > +   case BIT_FIELD_INSERT:
> > >   return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> > >
> > > default:
> > > *** fold_ternary_loc (location_t loc, enum t
> > > *** 11870,11875 
> > > --- 11871,11916 
> > > }
> > > return NULL_TREE;
> > >
> > > + case BIT_FIELD_INSERT:
> > > +   /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > > +   if (TREE_CODE (arg0) == INTEGER_CST
> > > + && TREE_CODE (arg1) == 

Re: [PATCH][RFC] Introduce BIT_FIELD_INSERT

2018-11-15 Thread Richard Biener
On Wed, 14 Nov 2018, Andrew Pinski wrote:

> On Fri, May 13, 2016 at 3:51 AM Richard Biener  wrote:
> >
> >
> > The following patch adds BIT_FIELD_INSERT, an operation to
> > facilitate doing bitfield inserts on registers (as opposed
> > to currently where we'd have a BIT_FIELD_REF store).
> >
> > Originally this was developed as part of bitfield lowering
> > where bitfield stores were lowered into read-modify-write
> > cycles and the modify part, instead of doing shifting and masking,
> > be kept in a more high-level form to ease combining them.
> >
> > A second use case (the above is still valid) is vector element
> > inserts which we currently can only do via memory or
> > by extracting all components and re-building the vector using
> > a CONSTRUCTOR.  For this second use case I added code
> > re-writing the BIT_FIELD_REF stores the C family FEs produce
> > into BIT_FIELD_INSERT when update-address-taken can otherwise
> > re-write a decl into SSA form (the testcase shows we miss
> > a similar opportunity with the MEM_REF form of a vector insert,
> > I plan to fix that for the final submission).
> >
> > One speciality of BIT_FIELD_INSERT as opposed to BIT_FIELD_REF
> > is that the size of the insertion is given implicitely via the
> > type size/precision of the value to insert.  That avoids
> > introducing ways to have quaternary ops in folding and GIMPLE stmts.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > Richard.
> >
> > 2011-06-16  Richard Guenther  
> >
> > PR tree-optimization/29756
> > * tree.def (BIT_FIELD_INSERT): New tcc_expression tree code.
> > * expr.c (expand_expr_real_2): Handle BIT_FIELD_INSERT.
> > * fold-const.c (operand_equal_p): Likewise.
> > (fold_ternary_loc): Add constant folding of BIT_FIELD_INSERT.
> > * gimplify.c (gimplify_expr): Handle BIT_FIELD_INSERT.
> > * tree-inline.c (estimate_operator_cost): Likewise.
> > * tree-pretty-print.c (dump_generic_node): Likewise.
> > * tree-ssa-operands.c (get_expr_operands): Likewise.
> > * cfgexpand.c (expand_debug_expr): Likewise.
> > * gimple-pretty-print.c (dump_ternary_rhs): Likewise.
> > * gimple.c (get_gimple_rhs_num_ops): Handle BIT_FIELD_INSERT.
> > * tree-cfg.c (verify_gimple_assign_ternary): Verify 
> > BIT_FIELD_INSERT.
> >
> > * tree-ssa.c (non_rewritable_lvalue_p): We can rewrite
> > vector inserts using BIT_FIELD_REF on the lhs.
> > (execute_update_addresses_taken): Do it.
> >
> > * gcc.dg/tree-ssa/vector-6.c: New testcase.
> >
> > Index: trunk/gcc/expr.c
> > ===
> > *** trunk.orig/gcc/expr.c   2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/expr.c2016-05-12 15:40:32.481225744 +0200
> > *** expand_expr_real_2 (sepops ops, rtx targ
> > *** 9358,9363 
> > --- 9358,9380 
> > target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, 
> > target);
> > return target;
> >
> > + case BIT_FIELD_INSERT:
> > +   {
> > +   unsigned bitpos = tree_to_uhwi (treeop2);
> > +   unsigned bitsize;
> > +   if (INTEGRAL_TYPE_P (TREE_TYPE (treeop1)))
> > + bitsize = TYPE_PRECISION (TREE_TYPE (treeop1));
> > +   else
> > + bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (treeop1)));
> > +   rtx op0 = expand_normal (treeop0);
> > +   rtx op1 = expand_normal (treeop1);
> > +   rtx dst = gen_reg_rtx (mode);
> > +   emit_move_insn (dst, op0);
> > +   store_bit_field (dst, bitsize, bitpos, 0, 0,
> > +TYPE_MODE (TREE_TYPE (treeop1)), op1, false);
> > +   return dst;
> > +   }
> > +
> >   default:
> > gcc_unreachable ();
> >   }
> > Index: trunk/gcc/fold-const.c
> > ===
> > *** trunk.orig/gcc/fold-const.c 2016-05-12 13:40:30.704262951 +0200
> > --- trunk/gcc/fold-const.c  2016-05-13 09:41:13.509812127 +0200
> > *** operand_equal_p (const_tree arg0, const_
> > *** 3163,3168 
> > --- 3163,3169 
> >
> > case VEC_COND_EXPR:
> > case DOT_PROD_EXPR:
> > +   case BIT_FIELD_INSERT:
> >   return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);
> >
> > default:
> > *** fold_ternary_loc (location_t loc, enum t
> > *** 11870,11875 
> > --- 11871,11916 
> > }
> > return NULL_TREE;
> >
> > + case BIT_FIELD_INSERT:
> > +   /* Perform (partial) constant folding of BIT_FIELD_INSERT.  */
> > +   if (TREE_CODE (arg0) == INTEGER_CST
> > + && TREE_CODE (arg1) == INTEGER_CST)
> > +   {
> > + unsigned HOST_WIDE_INT bitpos = tree_to_uhwi (op2);
> > + unsigned bitsize = TYPE_PRECISION (TREE_TYPE (arg1));
> > + wide_int tem = wi::bit_and (arg0,
> > + wi::shifted_mask 

[PATCH] Fix PR87917

2018-11-15 Thread Richard Biener


The following side-steps a possible issue with the 
evolution_function_is_affine_multivariate_p predicate by guarding
the call to analyze_subscript_affine_affine in analyze_miv_subscript
in the same way as the call from analyze_siv_subscript.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2018-11-15  Richard Biener  

PR middle-end/87917
* tree-data-ref.c (analyze_miv_subscript): Guard calls to
analyze_subscript_affine_affine properly.

* gcc.dg/tree-ssa/pr87917.c: New testcase.

Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c (revision 266145)
+++ gcc/tree-data-ref.c (working copy)
@@ -3994,9 +3993,9 @@ analyze_miv_subscript (tree chrec_a,
   dependence_stats.num_miv_independent++;
 }
 
-  else if (evolution_function_is_affine_multivariate_p (chrec_a, 
loop_nest->num)
+  else if (evolution_function_is_affine_in_loop (chrec_a, loop_nest->num)
   && !chrec_contains_symbols (chrec_a)
-  && evolution_function_is_affine_multivariate_p (chrec_b, 
loop_nest->num)
+  && evolution_function_is_affine_in_loop (chrec_b, loop_nest->num)
   && !chrec_contains_symbols (chrec_b))
 {
   /* testsuite/.../ssa-chrec-35.c

Index: gcc/testsuite/gcc.dg/tree-ssa/pr87917.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr87917.c (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr87917.c (working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribute-patterns -ftrapv -fno-tree-fre" } 
*/
+
+void foo(int x[])
+{
+ int i, j;
+
+ for (i = 0; i < 2; i++)
+   for (j = 0; j < 2; j++)
+   {
+ x[i] = x[i*j];
+ x[i] = x[i*j];
+   }
+}