gcc: docs: Fix documentation of two hooks

2024-04-08 Thread Matthew Malcomson
The `function_attribute_inlinable_p` hook documentation described it
returning the value if it is OK to inline the provided fndecl into "the
current function".  AFAICS This hook is only called when
`current_function_decl` is the same as the `fndecl` argument that the
hook is given, hence asking whether `fndecl` can be inlined into "the
current function" doesn't make sense.
Update the documentation to match this understanding.

The `unspec_may_trap_p` documentation mentioned applying to either
`unspec` or `unspec_volatile`.  AFAICS this hook is only used for
`unspec` codes since c84a808e493a, so I removed the mention of
`unspec_volatile`.

gcc/ChangeLog:

* doc/tm.texi (function_attribute_inlinable_p,
unspec_may_trap_p): Update documentation.
* target.def (function_attribute_inlinable_p,
unspec_may_trap_p): Update documentation.

--
N.b. not entirely sure who to ask for review, went with docs maintainers, but
if that's incorrect please do redirect me.

### Attachment also inlined for ease of reply###


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 
c8b8b126b2424b6552f824ba42ac329cfaf84d84..f0051f0ae1e9444d5d585135c90a68ca760c2fbd
 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10752,10 +10752,10 @@ attribute handlers.  So far this only affects the 
@var{noinit} and
 
 @deftypefn {Target Hook} bool TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P 
(const_tree @var{fndecl})
 @cindex inlining
-This target hook returns @code{true} if it is OK to inline @var{fndecl}
-into the current function, despite its having target-specific
-attributes, @code{false} otherwise.  By default, if a function has a
-target specific attribute attached to it, it will not be inlined.
+This target hook returns @code{false} if the target-specific attributes on
+@var{fndecl} always block it getting inlined, @code{true} otherwise.  By
+default, if a function has a target specific attribute attached to it, it
+will not be inlined.
 @end deftypefn
 
 @deftypefn {Target Hook} bool TARGET_OPTION_VALID_ATTRIBUTE_P (tree 
@var{fndecl}, tree @var{name}, tree @var{args}, int @var{flags})
@@ -12245,12 +12245,10 @@ allocation.
 @end deftypefn
 
 @deftypefn {Target Hook} int TARGET_UNSPEC_MAY_TRAP_P (const_rtx @var{x}, 
unsigned @var{flags})
-This target hook returns nonzero if @var{x}, an @code{unspec} or
-@code{unspec_volatile} operation, might cause a trap.  Targets can use
-this hook to enhance precision of analysis for @code{unspec} and
-@code{unspec_volatile} operations.  You may call @code{may_trap_p_1}
-to analyze inner elements of @var{x} in which case @var{flags} should be
-passed along.
+This target hook returns nonzero if @var{x}, an @code{unspec} might cause
+a trap.  Targets can use this hook to enhance precision of analysis for
+@code{unspec} operations.  You may call @code{may_trap_p_1} to analyze inner
+elements of @var{x} in which case @var{flags} should be passed along.
 @end deftypefn
 
 @deftypefn {Target Hook} void TARGET_SET_CURRENT_FUNCTION (tree @var{decl})
diff --git a/gcc/target.def b/gcc/target.def
index 
fdad7bbc93e2ad8aea30336d5cd4af67801e9c74..2b2a6c11807eff228788fae1cd1370e8971fbf3e
 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2314,10 +2314,10 @@ attribute handlers.  So far this only affects the 
@var{noinit} and\n\
 DEFHOOK
 (function_attribute_inlinable_p,
  "@cindex inlining\n\
-This target hook returns @code{true} if it is OK to inline @var{fndecl}\n\
-into the current function, despite its having target-specific\n\
-attributes, @code{false} otherwise.  By default, if a function has a\n\
-target specific attribute attached to it, it will not be inlined.",
+This target hook returns @code{false} if the target-specific attributes on\n\
+@var{fndecl} always block it getting inlined, @code{true} otherwise.  By\n\
+default, if a function has a target specific attribute attached to it, it\n\
+will not be inlined.",
  bool, (const_tree fndecl),
  hook_bool_const_tree_false)
 
@@ -4057,12 +4057,10 @@ allocation.",
FLAGS has the same meaning as in rtlanal.cc: may_trap_p_1.  */
 DEFHOOK
 (unspec_may_trap_p,
- "This target hook returns nonzero if @var{x}, an @code{unspec} or\n\
-@code{unspec_volatile} operation, might cause a trap.  Targets can use\n\
-this hook to enhance precision of analysis for @code{unspec} and\n\
-@code{unspec_volatile} operations.  You may call @code{may_trap_p_1}\n\
-to analyze inner elements of @var{x} in which case @var{flags} should be\n\
-passed along.",
+ "This target hook returns nonzero if @var{x}, an @code{unspec} might cause\n\
+a trap.  Targets can use this hook to enhance precision of analysis for\n\
+@code{unspec} operations.  You may call @code{may_trap_p_1} to analyze inner\n\
+elements of @var{x} in which case @var{flags} should be passed along.",
  int, (const_rtx x, unsigned flags),
  default_unspec_may_trap_p)
 



diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 

Re: [PATCH] arm: fix c23 0-named-args caller-side stdarg

2024-02-26 Thread Matthew Malcomson

Hi Alexandre,

I don't have the ability to OK the patch, but I'm attempting to do a 
review in

order to reduce the workload for any maintainer.  (Apologies for the slow
response).

I think you're right that the AAPCS32 requires all arguments to be passed in
registers for this testcase.
(Nit on the commit-message: It says that your reading of the AAPCS32 
suggests
that the *caller* is correct -- I believe based on the change you 
suggested you

meant *callee* is correct in expecting arguments in registers.)

The approach you suggest looks OK to me -- I do notice that it doesn't 
fix the

legacy ABI's of `atpcs` and `apcs` and guess it would be nicer to have them
working at the same time though would defer to maintainers on how 
important that

is.
(For the benefit of others reading) I don't believe there is any ABI concern
with this since it's fixing something that is currently not working at 
all and

only applies to c23 (so a change shouldn't have too much of an impact).

You mention you chose to make the change in the arm backend rather than 
general

code due to hesitancy to change the generic ABI-affecting code. That makes
sense to me, certainly at this late stage in the development cycle.

That said, I do expect the more robust solution would be to change a part of
the generic ABI code and would like to check that this makes sense to 
you and

anyone else upstream in the discussion (justification below).
So would suggest to the maintainers that maybe this goes in with a note to
remember to look at a possible generic code change later on.

Does that sound reasonable to you?

--
Justification for why a change to the generic ABI code would be better 
in the

end:

IIUC (from the documentation of the hooks) `pretend_outgoing_varargs_named`
really should mean that `n_named_args = num_actuals`.  That seems to be 
what the

documentation for `strict_argument_naming` indicates should happen.

``` (Documentation for TARGET_STRICT_ARGUMENT_NAMING):
If it returns 'false', but 'TARGET_PRETEND_OUTGOING_VARARGS_NAMED' returns
'true', then all arguments are treated as named.
```

Because of this I guess that if `pretend_outgoing_varargs_named` only 
means to
pretend the above *except* in the c23 0-named-args case that seems like 
it would

be a bit awkward to account for in backends.

From a quick check on c23-stdarg-4.c it does look like the below change 
ends up
with the same codegen as your patch (except in the case of those legacy 
ABI's,

where the below does make the caller and callee ABI match AFAICT):

```
  diff --git a/gcc/calls.cc b/gcc/calls.cc
  index 01f44734743..0b302f633ed 100644
  --- a/gcc/calls.cc
  +++ b/gcc/calls.cc
  @@ -2970,14 +2970,15 @@ expand_call (tree exp, rtx target, int ignore)
    we do not have any reliable way to pass unnamed args in
    registers, so we must force them into memory.  */

  -  if (type_arg_types != 0
  +  if ((type_arg_types != 0 || TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
 && targetm.calls.strict_argument_naming (args_so_far))
   ;
 else if (type_arg_types != 0
 && ! targetm.calls.pretend_outgoing_varargs_named 
(args_so_far))

   /* Don't include the last named arg.  */
   --n_named_args;
  -  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype))
  +  else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)
  +    && ! targetm.calls.pretend_outgoing_varargs_named (args_so_far))
   n_named_args = 0;
 else
   /* Treat all args as named.  */
```

Do you agree that this makes sense (i.e. is there something I'm 
completely missing)?


FWIW I attempted to try and find other targets which have
`strict_argument_naming` returning `false`, `pretend_outgoing_varargs_named`
returning true, and some use of the `.named` member of function 
arguments.  I

got the below list.  I recognise it doesn't mean there's a bug in these
backends, but thought it might help the discussion.
(lm32 mcore msp430 gcn cris fr30 frv h8300 arm v850 rx pru)



On 1/23/24 08:26, Alexandre Oliva wrote:

On Dec  5, 2023, Alexandre Oliva  wrote:


arm: fix c23 0-named-args caller-side stdarg

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639472.html


The commit message doesn't name explicitly the fixed testsuite
failures.  Here they are:
FAIL: gcc.dg/c23-stdarg-4.c execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -O0  execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -O1  execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -O2  execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -O3 -g  execution test
FAIL: gcc.dg/torture/c23-stdarg-split-1a.c   -Os  execution test
Tested on arm-eabi.  Ok to install?




Re: [PATCH 0/12] GCC _BitInt support [PR102989]

2023-09-18 Thread Matthew Malcomson via Gcc-patches

On 8/9/23 19:14, Jakub Jelinek via Gcc-patches wrote:


It is enabled only on targets which have agreed on processor specific
ABI how to lay those out or pass as function arguments/return values,
which currently is just x86-64 I believe, would be nice if target maintainers
helped to get agreement on psABI changes and GCC 14 could enable it on far
more architectures than just one.


Hello,

Do you know of any other architectures in the process of defining a 
psABI for _BitInt (or for that matter having defined an ABI between then 
and now)?


We're currently working on completing the ABI for AArch64 and AArch32 
hoping that it could unblock the enablement for those arches.
We'd like to know about the potential clashes with other ABI's w.r.t. 
user-visible behaviour as part of the decision making.


I'd not found any other defined ABI's online, but thought it worth 
asking in case I just missed it.


Thanks,

Matthew



[PATCH] mid-end: Use integral time intervals in timevar.cc

2023-08-04 Thread Matthew Malcomson via Gcc-patches
Hopefully last update ...

> Specifically, please try compiling with
>-ftime-report -fdiagnostics-format=sarif-file
> and have a look at the generated .sarif file, e.g. via
>python -m json.tool foo.c.sarif
> which will pretty-print the JSON to stdout.

Rebasing onto the JSON output was quite simple -- I've inlined the only
change in the patch below (to cast to floating point seconds before
generating the json report).

I have manually checked the SARIF output as you suggested and all looks
good (an in fact better because we no longer save some strange times
like the below due to avoiding the floating point rounding).
  "wall": -4.49516e-09,


> 
> The patch looks OK to me if it passes bootstrap / regtest and the
> output of -ftime-report doesn't change (too much).
> 
> Thanks,
> Richard.

Though I don't expect you were asking for this, confirmation below that
the output doesn't change.  (Figured I may as well include that info
since the rebase to include the JSON output that David had just added
required re-sending an email anyway).



```
hw-a20-8:checking-theory [10:07:01] $ ${old_build}/gcc/xgcc -B${old_build}/gcc/ 
   -fdiagnostics-plain-output-Os  -w -S test-sum.c -o /dev/null   
-ftime-report

Time variable   usr   sys  wall 
  GGC
 phase setup:   0.01 ( 14%)   0.00 (  0%)   0.02 ( 18%) 
 3389k ( 74%)
 phase parsing  :   0.03 ( 43%)   0.02 ( 67%)   0.06 ( 55%) 
  982k ( 21%)
 phase opt and generate :   0.03 ( 43%)   0.01 ( 33%)   0.03 ( 27%) 
  215k (  5%)
 callgraph functions expansion  :   0.02 ( 29%)   0.00 (  0%)   0.03 ( 27%) 
  162k (  4%)
 callgraph ipa passes   :   0.01 ( 14%)   0.01 ( 33%)  -0.00 ( -0%) 
   38k (  1%)
 CFG verifier   :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
0  (  0%)
 preprocessing  :   0.02 ( 29%)   0.02 ( 67%)   0.02 ( 18%) 
  272k (  6%)
 lexical analysis   :   0.01 ( 14%)   0.00 (  0%)   0.02 ( 18%) 
0  (  0%)
 parser (global):   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
  637k ( 14%)
 parser function body   :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   18k (  0%)
 tree CFG cleanup   :   0.00 (  0%)   0.01 ( 33%)   0.00 (  0%) 
0  (  0%)
 tree STMT verifier :   0.01 ( 14%)   0.00 (  0%)   0.00 (  0%) 
0  (  0%)
 expand :   0.01 ( 14%)   0.00 (  0%)   0.00 (  0%) 
   12k (  0%)
 loop init  :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   12k (  0%)
 initialize rtl :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   15k (  0%)
 rest of compilation:   0.01 ( 14%)   0.00 (  0%)   0.00 (  0%) 
 6920  (  0%)
 TOTAL  :   0.07  0.03  0.11
 4587k
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
hw-a20-8:checking-theory [10:06:44] $ ${new_build}/gcc/xgcc -B${new_build}/gcc/ 
   -fdiagnostics-plain-output-Os  -w -S test-sum.c -o /dev/null   
-ftime-report

Time variable   usr   sys  wall 
  GGC
 phase setup:   0.01 ( 17%)   0.00 (  0%)   0.02 ( 18%) 
 3389k ( 74%)
 phase parsing  :   0.02 ( 33%)   0.03 ( 75%)   0.05 ( 45%) 
  982k ( 21%)
 phase opt and generate :   0.03 ( 50%)   0.01 ( 25%)   0.04 ( 36%) 
  215k (  5%)
 callgraph construction :   0.00 (  0%)   0.01 ( 25%)   0.00 (  0%) 
 1864  (  0%)
 callgraph functions expansion  :   0.02 ( 33%)   0.00 (  0%)   0.03 ( 27%) 
  162k (  4%)
 callgraph ipa passes   :   0.01 ( 17%)   0.00 (  0%)   0.01 (  9%) 
   38k (  1%)
 ipa free lang data :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
0  (  0%)
 CFG verifier   :   0.02 ( 33%)   0.00 (  0%)   0.00 (  0%) 
0  (  0%)
 preprocessing  :   0.01 ( 17%)   0.03 ( 75%)   0.01 (  9%) 
  272k (  6%)
 lexical analysis   :   0.01 ( 17%)   0.00 (  0%)   0.02 ( 18%) 
0  (  0%)
 parser (global):   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
  637k ( 14%)
 parser inl. func. body :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   19k (  0%)
 tree STMT verifier :   0.01 ( 17%)   0.00 (  0%)   0.00 (  0%) 
0  (  0%)
 expand :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   12k (  0%)
 integrated RA  :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   50k (  1%)
 initialize rtl :   0.00 (  0%)   0.00 (  0%)   0.01 (  9%) 
   15k (  0%)
 TOTAL  :   0.06  0.04  0.11
 4587k
Extra diagnostic checks enabled; compiler may run slowly.
Configure 

Re: [PATCH] mid-end: Use integral time intervals in timevar.cc

2023-08-03 Thread Matthew Malcomson via Gcc-patches

On 8/3/23 15:09, David Malcolm wrote:


Hi Matthew.  I recently touched the timevar code (in r14-2881-
g75d623946d4b6e) to add support for serializing the timevar data in
JSON form as part of the SARIF output (PR analyzer/109361).

Looking at your patch, it looks like the baseline for the patch seems
to predate r14-2881-g75d623946d4b6e.

I don't have a strong opinion on the implementation choices in your
patch, but please can you rebase to beyond my recent change and make
sure that the SARIF serialization still works with your patch.

Specifically, please try compiling with
   -ftime-report -fdiagnostics-format=sarif-file
and have a look at the generated .sarif file, e.g. via
   python -m json.tool foo.c.sarif
which will pretty-print the JSON to stdout.

Currently I'm writing out the values as floating-point seconds, and
AFAIK my analyzer integration testsuite [1] is the only consumer of
this data.


Hi David,

Thanks for the heads-up.  Will update the patch.

I read your last paragraph as suggesting that you'd be open to changing 
the format.  Is that correct?


I would initially assume that writing out the time as floating-point 
seconds would still be most convenient for your use since it looks to be 
like something to be presented to a person.


However, since I don't know much about the intended uses of SARIF in 
general I figured I should double-check -- does that choice to remain 
printing out floating-point seconds seem best to you?




[...snip...]

Thanks
Dave
[1]
https://github.com/davidmalcolm/gcc-analyzer-integration-tests/issues/5





[PATCH] mid-end: Use integral time intervals in timevar.cc

2023-08-03 Thread Matthew Malcomson via Gcc-patches
> 
> I think this is undesriable.  With fused you mean we use FMA?
> I think you could use -ffp-contract=off for the TU instead.
> 
> Note you can't use __attribute__((noinline)) literally since the
> host compiler might not support this.
> 
> Richard.
> 


Trying to make the timevar store integral time intervals.
Hope this is acceptable -- I had originally planned to use
`-ffp-contract` as agreed until I saw the email mentioning the old x86
bug in the same area which was not to do with floating point contraction
of operations (PR 99903) and figured it would be better to try and solve
both at the same time while making things in general a bit more robust.



On some AArch64 bootstrapped builds, we were getting a flaky test
because the floating point operations in `get_time` were being fused
with the floating point operations in `timevar_accumulate`.

This meant that the rounding behaviour of our multiplication with
`ticks_to_msec` was different when used in `timer::start` and when
performed in `timer::stop`.  These extra inaccuracies led to the
testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.

--
Avoiding the inlining which was agreed to be undesirable.  Three
alternative approaches:
1) Use `-ffp-contract=on` to avoid this particular optimisation.
2) Adjusting the code so that the "tolerance" is always of the order of
   a "tick".
3) Recording times and elapsed differences in integral values.
   - Could be in terms of a standard measurement (e.g. nanoseconds or
 microseconds).
   - Could be in terms of whatever integral value ("ticks" /
 seconds / "clock ticks") is returned from the syscall
 chosen at configure time.

While `-ffp-contract=on` removes the problem that I bumped into, there
has been a similar bug on x86 that was to do with a different floating
point problem that also happens after `get_time` and
`timevar_accumulate` both being inlined into the same function.  Hence
it seems worth choosing a different approach.

Of the two other solutions, recording measurements in integral values
seems the most robust against slightly "off" measurements being
presented to the user -- even though it could avoid the ICE that creates
a flaky test.

I considered storing time in whatever units our syscall returns and
normalising them at the time we print out rather than normalising them
to nanoseconds at the point we record our "current time".  The logic
being that normalisation could have some rounding affect (e.g. if
TICKS_PER_SECOND is 3) that would be taken into account in calculations.

I decided against it in order to give the values recorded in
`timevar_time_def` some interpretive value so it's easier to read the
code.  Compared to the small rounding that would represent a tiny amount
of time and AIUI can not trigger the same kind of ICE's as we are
attempting to fix, said interpretive value seems more valuable.

Recording time in microseconds seemed reasonable since all obvious
values for ticks and `getrusage` are at microsecond granularity or less
precise.  That said, since TICKS_PER_SECOND and CLOCKS_PER_SEC are both
variables given to use by the host system I was not sure of that enough
to make this decision.

--
timer::all_zero is ignoring rows which are inconsequential to the user
and would be printed out as all zeros.  Since upon printing rows we
convert to the same double value and print out the same precision as
before, we return true/false based on the same amount of time as before.

timer::print_row casts to a floating point measurement in units of
seconds as was printed out before.

timer::validate_phases -- I'm printing out nanoseconds here rather than
floating point seconds since this is an error message for when things
have "gone wrong" printing out the actual nanoseconds that have been
recorded seems like the best approach.
N.b. since we now print out nanoseconds instead of floating point value
the padding requirements are different.  Originally we were padding to
24 characters and printing 18 decimal places.  This looked odd with the
now visually smaller values getting printed.  I judged 13 characters
(corresponding to 2 hours) to be a reasonable point at which our
alignment could start to degrade and this provides a more compact output
for the majority of cases (checked by triggering the error case via
GDB).

--
N.b. I use a literal 10 for "NANOSEC_PER_SEC".  I believe this
would fit in an integer on all hosts that GCC supports, but am not
certain there are not strange integer sizes we support hence am pointing
it out for special attention during review.

--
No expected change in generated code.
Bootstrapped and regtested on AArch64 with no regressions.
Manually checked that flaky test is no longer flaky on the machine it
was seen before.

gcc/ChangeLog:

PR 

[committed] Add check_vect in a testcase

2023-07-26 Thread Matthew Malcomson via Gcc-patches
Also reformat a comment that had too long lines.
Commit c5bd0e5870a introduced both these problems to fix.

Committed as obvious after ensuring that when running the testcase on
AArch64 it still fails before the original c5bd0e5870a commit and passes
after it.

gcc/ChangeLog:

* tree-vect-stmts.cc (get_group_load_store_type): Reformat
comment.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-multi-peel-gaps.c: Add `check_vect` call into
`main` of this testcase.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c 
b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
index 
1aab4c5a14d1e8346d89587bd9544a1516535a45..7e7db3c7a45fa1ecc748f51353170ec66c5dfed7
 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
@@ -9,6 +9,7 @@
 /* { dg-require-effective-target mmap } */
 #include 
 #include 
+#include "tree-vect.h"
 
 #define MMAP_SIZE 0x2
 #define ADDRESS 0x112200
@@ -24,6 +25,7 @@ void initialise_s(int *s) { }
 int main() {
 void *s_mapping;
 void *end_s;
+check_vect ();
 s_mapping = mmap ((void *)ADDRESS, MMAP_SIZE, PROT_READ | PROT_WRITE,
  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 if (s_mapping == MAP_FAILED)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 
99e61f8a85dc6cebbb3cd4183f8bd7fdfe9aa1de..5018bd23e6ec457a8e4790d333752b7e009228e0
 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2214,8 +2214,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 we can end up with no gap recorded but still excess
 elements accessed, see PR103116.  Make sure we peel for
 gaps if necessary and sufficient and give up if not.
-If there is a combination of the access not covering the full 
vector and
-a gap recorded then we may need to peel twice.  */
+
+If there is a combination of the access not covering the full
+vector and a gap recorded then we may need to peel twice.  */
  if (loop_vinfo
  && *memory_access_type == VMAT_CONTIGUOUS
  && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()



diff --git a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c 
b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
index 
1aab4c5a14d1e8346d89587bd9544a1516535a45..7e7db3c7a45fa1ecc748f51353170ec66c5dfed7
 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
@@ -9,6 +9,7 @@
 /* { dg-require-effective-target mmap } */
 #include 
 #include 
+#include "tree-vect.h"
 
 #define MMAP_SIZE 0x2
 #define ADDRESS 0x112200
@@ -24,6 +25,7 @@ void initialise_s(int *s) { }
 int main() {
 void *s_mapping;
 void *end_s;
+check_vect ();
 s_mapping = mmap ((void *)ADDRESS, MMAP_SIZE, PROT_READ | PROT_WRITE,
  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 if (s_mapping == MAP_FAILED)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 
99e61f8a85dc6cebbb3cd4183f8bd7fdfe9aa1de..5018bd23e6ec457a8e4790d333752b7e009228e0
 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2214,8 +2214,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 we can end up with no gap recorded but still excess
 elements accessed, see PR103116.  Make sure we peel for
 gaps if necessary and sufficient and give up if not.
-If there is a combination of the access not covering the full 
vector and
-a gap recorded then we may need to peel twice.  */
+
+If there is a combination of the access not covering the full
+vector and a gap recorded then we may need to peel twice.  */
  if (loop_vinfo
  && *memory_access_type == VMAT_CONTIGUOUS
  && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()





Re: [PATCH] Reduce floating-point difficulties in timevar.cc

2023-07-21 Thread Matthew Malcomson via Gcc-patches

On 7/21/23 14:45, Xi Ruoyao wrote:

On Fri, 2023-07-21 at 14:11 +0100, Matthew Malcomson wrote:

My understanding is that this is not a hardware bug and that it's
specified that rounding does not happen on the multiply "sub-part" in
`FNMSUB`, but rounding happens on the `FMUL` that generates some input
to it.


AFAIK the C standard does only say "A floating *expression* may be
contracted".  I.e:

double r = a * b + c;

may be compiled to use FMA because "a * b + c" is a floating point
expression.  But

double t = a * b;
double r = t + c;

is not, because "a * b" and "t + c" are two separate floating point
expressions.

So a contraction across two functions is not allowed.  We now have -ffp-
contract=on (https://gcc.gnu.org/r14-2023) to only allow C-standard
contractions.

Perhaps -ffp-contract=on (not off) is enough to fix the issue (if you
are building GCC 14 snapshot).  The default is "fast" (if no -std=
option is used), which allows some contractions disallowed by the
standard.

But GCC is in C++ and I'm not sure if the C++ standard has the same
definition for allowed contractions as C.



Thanks -- I'll look into whether `-ffp-contract=on` works.



It's possible that the test itself is flaky.  Can you provide some
detail about how it fails?



Sure -- The outline is that `timer::validate_phases` sees the sum of 
sub-part timers as greater than the timer for the "overall" time 
(outside of a tolerance of 1.01).  It then complains and hits 
`gcc_unreachable()`.


While I found it difficult to get enough information out of the test 
that is run in the testsuite, I found that if passing an invalid 
argument to `cc1plus` all sub-parts would be zero, and sometimes the 
"total" would be negative.


This was due to the `times` syscall returning the same clock tick for 
start and end of the "total" timer and the difference in rounding 
between FNMSUB and FMUL means that depending on what that clock tick is 
the "elapsed time" can end up calculated as negative.


I didn't proove it 100% but I believe the same fundamental difference 
(but opposite rounding error) could trigger the testsuite failure -- if 
the "end" of one sub-phase timer is greater than the "start" of another 
sub-phase timer then sum of parts could be greater than total.


There is a "tolerance" in this test that I considered increasing, but 
since that would not affect the "invalid arguments" thing (where the 
total is negative and hence the tolerance multiplication of 1.01 
would have to be supplemented by a positive offset) I suggested avoiding 
the inline.


W.r.t. the x86 bug that Alexander Monakov has pointed to, it's a very 
similar thing but in this case the problem is not bit-precision of 
values after the inlining, but rather a difference between fused and not 
fused operations after the inlining.


Agreed that using integral arithmetic is the more robust solution.


Re: [PATCH] Reduce floating-point difficulties in timevar.cc

2023-07-21 Thread Matthew Malcomson via Gcc-patches

Responding to two emails at the same time ;-)

On 7/21/23 13:47, Richard Biener wrote:

On Fri, 21 Jul 2023, Matthew Malcomson wrote:


On some AArch64 bootstrapped builds, we were getting a flaky test
because the floating point operations in `get_time` were being fused
with the floating point operations in `timevar_accumulate`.

This meant that the rounding behaviour of our multiplication with
`ticks_to_msec` was different when used in `timer::start` and when
performed in `timer::stop`.  These extra inaccuracies led to the
testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.

This change ensures those operations are not fused and hence stops the test
being flaky on that particular machine.  There is no expected change in the
generated code.
Bootstrap & regtest on AArch64 passes with no regressions.


I think this is undesriable.  With fused you mean we use FMA?
I think you could use -ffp-contract=off for the TU instead.


Yeah -- we used fused multiply subtract because we combined the multiply 
in `get_time` with the subtract in `timevar_accumulate`.




Note you can't use __attribute__((noinline)) literally since the
host compiler might not support this.

Richard.



On 7/21/23 13:49, Xi Ruoyao wrote:
...

I don't think it's correct.  It will break bootstrapping GCC from other
ISO C++11 compilers, you need to at least guard it with #ifdef __GNUC__.
And IMO it's just hiding the real problem.

We need more info of the "particular machine".  Is this a hardware bug
(i.e. the machine violates the AArch64 spec) or a GCC code generation
issue?  Or should we generally use -ffp-contract=off in BOOT_CFLAGS?



My understanding is that this is not a hardware bug and that it's 
specified that rounding does not happen on the multiply "sub-part" in 
`FNMSUB`, but rounding happens on the `FMUL` that generates some input 
to it.


I was given to understand from discussions with others that this codegen 
is allowed -- though I honestly didn't confirm the line of reasoning 
through all the relevant standards.




W.r.t. both:
Thanks for pointing out bootstrapping from other ISO C++ compilers -- 
(didn't realise that was a concern).


I can look into `-ffp-contract=off` as you both have recommended.
One question -- if we have concerns that the host compiler may not be 
able to handle `attribute((noinline))` would we also be concerned that 
this flag may not be supported?
(Or is the severity of lack of support sufficiently different in the two 
cases that this is fine -- i.e. not compile vs may trigger floating 
point rounding inaccuracies?)





[PATCH] Reduce floating-point difficulties in timevar.cc

2023-07-21 Thread Matthew Malcomson via Gcc-patches
On some AArch64 bootstrapped builds, we were getting a flaky test
because the floating point operations in `get_time` were being fused
with the floating point operations in `timevar_accumulate`.

This meant that the rounding behaviour of our multiplication with
`ticks_to_msec` was different when used in `timer::start` and when
performed in `timer::stop`.  These extra inaccuracies led to the
testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.

This change ensures those operations are not fused and hence stops the test
being flaky on that particular machine.  There is no expected change in the
generated code.
Bootstrap & regtest on AArch64 passes with no regressions.

gcc/ChangeLog:

* timevar.cc (get_time): Make this noinline to avoid fusing
behaviour and associated test flakyness.


N.b. I didn't know who to include as reviewer -- guessed Richard Biener as the
global reviewer that had the most contributions to this file and Richard
Sandiford since I've asked him for reviews a lot in the past.


### Attachment also inlined for ease of reply###


diff --git a/gcc/timevar.cc b/gcc/timevar.cc
index 
d695297aae7f6b2a6de01a37fe86c2a232338df0..5ea4ec259e114f31f611e7105cd102f4c9552d18
 100644
--- a/gcc/timevar.cc
+++ b/gcc/timevar.cc
@@ -212,6 +212,7 @@ timer::named_items::print (FILE *fp, const timevar_time_def 
*total)
HAVE_WALL_TIME macros.  */
 
 static void
+__attribute__((noinline))
 get_time (struct timevar_time_def *now)
 {
   now->user = 0;



diff --git a/gcc/timevar.cc b/gcc/timevar.cc
index 
d695297aae7f6b2a6de01a37fe86c2a232338df0..5ea4ec259e114f31f611e7105cd102f4c9552d18
 100644
--- a/gcc/timevar.cc
+++ b/gcc/timevar.cc
@@ -212,6 +212,7 @@ timer::named_items::print (FILE *fp, const timevar_time_def 
*total)
HAVE_WALL_TIME macros.  */
 
 static void
+__attribute__((noinline))
 get_time (struct timevar_time_def *now)
 {
   now->user = 0;





Re: vectorizer: Avoid an OOB access from vectorization

2023-07-18 Thread Matthew Malcomson via Gcc-patches
Tamar pointed out it would be good to have a `scan-tree-dump` in the testcase
just to make sure that when something is currently vectorizing it stays
vectorizing (and hence that the new code is still likely running).

Attached patch has that change, also inlined for ease of reply.

--
> Our checks for whether the vectorization of a given loop would make an
> out of bounds access miss the case when the vector we load is so large
> as to span multiple iterations worth of data (while only being there to
> implement a single iteration).
> 
> This patch adds a check for such an access.
> 
> Example where this was going wrong (smaller version of testcase added):
> 
> ```
>   extern unsigned short multi_array[5][16][16];
>   extern void initialise_s(int *);
>   extern int get_sval();
> 
>   void foo() {
> int s0 = get_sval();
> int s[31];
> int i,j;
> initialise_s([0]);
> s0 = get_sval();
> for (j=0; j < 16; j++)
>   for (i=0; i < 16; i++)
>   multi_array[1][j][i]=s[j*2];
>   }
> ```
> 
> With the above loop we would load the `s[j*2]` integer into a 4 element
> vector, which reads 3 extra elements than the scalar loop would.
> `get_group_load_store_type` identifies that the loop requires a scalar
> epilogue due to gaps.  However we do not identify that the above code
> requires *two* scalar loops to be peeled due to the fact that each
> iteration loads an amount of data from the *next* iteration (while not
> using it).
> 
> Bootstrapped and regtested on aarch64-none-linux-gnu.
> N.b. out of interest we came across this working with Morello.
> 
> 


### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c 
b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
new file mode 100644
index 
..1aab4c5a14d1e8346d89587bd9544a1516535a45
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
@@ -0,0 +1,61 @@
+/* For some targets we end up vectorizing the below loop such that the `sp`
+   single integer is loaded into a 4 integer vector.
+   While the writes are all safe, without 2 scalar loops being peeled into the
+   epilogue we would read past the end of the 31 integer array.  This happens
+   because we load a 4 integer chunk to only use the first integer and
+   increment by 2 integers at a time, hence the last load needs s[30-33] and
+   the penultimate load needs s[28-31].
+   This testcase ensures that we do not crash due to that behaviour.  */
+/* { dg-require-effective-target mmap } */
+#include 
+#include 
+
+#define MMAP_SIZE 0x2
+#define ADDRESS 0x112200
+
+#define MB_BLOCK_SIZE 16
+#define VERT_PRED_16 0
+#define HOR_PRED_16 1
+#define DC_PRED_16 2
+int *sptr;
+extern void intrapred_luma_16x16();
+unsigned short mprr_2[5][16][16];
+void initialise_s(int *s) { }
+int main() {
+void *s_mapping;
+void *end_s;
+s_mapping = mmap ((void *)ADDRESS, MMAP_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+if (s_mapping == MAP_FAILED)
+  {
+   perror ("mmap");
+   return 1;
+  }
+end_s = (s_mapping + MMAP_SIZE);
+sptr = (int*)(end_s - sizeof(int[31]));
+intrapred_luma_16x16(sptr);
+return 0;
+}
+
+void intrapred_luma_16x16(int * restrict sp) {
+for (int j=0; j < MB_BLOCK_SIZE; j++)
+  {
+   mprr_2[VERT_PRED_16][j][0]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][1]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][2]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][3]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][4]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][5]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][6]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][7]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][8]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][9]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][10]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][11]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][12]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][13]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][14]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][15]=sp[j*2];
+  }
+}
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" {target vect_int } } } 
*/
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 
c08d0ef951fc63adcfffc601917134ddf51ece45..1c8c6784cc7b5f2d327339ff55a5a5ea08835aab
 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2217,7 +2217,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 but the access in the loop doesn't cover the full vector
 we can end up with no gap recorded but still excess
 elements accessed, see PR103116.  Make sure we peel for
-gaps if necessary and sufficient and give up if not.  */
+gaps if necessary and sufficient and give up if not.
+If there is a combination of the access not covering the full 
vector and
+a gap 

vectorizer: Avoid an OOB access from vectorization

2023-07-14 Thread Matthew Malcomson via Gcc-patches
Our checks for whether the vectorization of a given loop would make an
out of bounds access miss the case when the vector we load is so large
as to span multiple iterations worth of data (while only being there to
implement a single iteration).

This patch adds a check for such an access.

Example where this was going wrong (smaller version of testcase added):

```
  extern unsigned short multi_array[5][16][16];
  extern void initialise_s(int *);
  extern int get_sval();

  void foo() {
int s0 = get_sval();
int s[31];
int i,j;
initialise_s([0]);
s0 = get_sval();
for (j=0; j < 16; j++)
  for (i=0; i < 16; i++)
multi_array[1][j][i]=s[j*2];
  }
```

With the above loop we would load the `s[j*2]` integer into a 4 element
vector, which reads 3 extra elements than the scalar loop would.
`get_group_load_store_type` identifies that the loop requires a scalar
epilogue due to gaps.  However we do not identify that the above code
requires *two* scalar loops to be peeled due to the fact that each
iteration loads an amount of data from the *next* iteration (while not
using it).

Bootstrapped and regtested on aarch64-none-linux-gnu.
N.b. out of interest we came across this working with Morello.


### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c 
b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
new file mode 100644
index 
..1b721fd26cab8d5583b153dd6b28c914db870ec3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-multi-peel-gaps.c
@@ -0,0 +1,60 @@
+/* For some targets we end up vectorizing the below loop such that the `sp`
+   single integer is loaded into a 4 integer vector.
+   While the writes are all safe, without 2 scalar loops being peeled into the
+   epilogue we would read past the end of the 31 integer array.  This happens
+   because we load a 4 integer chunk to only use the first integer and
+   increment by 2 integers at a time, hence the last load needs s[30-33] and
+   the penultimate load needs s[28-31].
+   This testcase ensures that we do not crash due to that behaviour.  */
+/* { dg-require-effective-target mmap } */
+#include 
+#include 
+
+#define MMAP_SIZE 0x2
+#define ADDRESS 0x112200
+
+#define MB_BLOCK_SIZE 16
+#define VERT_PRED_16 0
+#define HOR_PRED_16 1
+#define DC_PRED_16 2
+int *sptr;
+extern void intrapred_luma_16x16();
+unsigned short mprr_2[5][16][16];
+void initialise_s(int *s) { }
+int main() {
+void *s_mapping;
+void *end_s;
+s_mapping = mmap ((void *)ADDRESS, MMAP_SIZE, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+if (s_mapping == MAP_FAILED)
+  {
+   perror ("mmap");
+   return 1;
+  }
+end_s = (s_mapping + MMAP_SIZE);
+sptr = (int*)(end_s - sizeof(int[31]));
+intrapred_luma_16x16(sptr);
+return 0;
+}
+
+void intrapred_luma_16x16(int * restrict sp) {
+for (int j=0; j < MB_BLOCK_SIZE; j++)
+  {
+   mprr_2[VERT_PRED_16][j][0]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][1]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][2]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][3]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][4]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][5]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][6]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][7]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][8]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][9]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][10]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][11]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][12]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][13]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][14]=sp[j*2];
+   mprr_2[VERT_PRED_16][j][15]=sp[j*2];
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 
c08d0ef951fc63adcfffc601917134ddf51ece45..1c8c6784cc7b5f2d327339ff55a5a5ea08835aab
 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2217,7 +2217,9 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 but the access in the loop doesn't cover the full vector
 we can end up with no gap recorded but still excess
 elements accessed, see PR103116.  Make sure we peel for
-gaps if necessary and sufficient and give up if not.  */
+gaps if necessary and sufficient and give up if not.
+If there is a combination of the access not covering the full 
vector and
+a gap recorded then we may need to peel twice.  */
  if (loop_vinfo
  && *memory_access_type == VMAT_CONTIGUOUS
  && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()
@@ -2233,7 +2235,7 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info 
stmt_info,
 access excess elements.
 ???  Enhancements include peeling multiple iterations
 or using masked loads with a static mask.  */
- 

docs: Fix wording describing the hwaddress sanitizer

2021-01-04 Thread Matthew Malcomson via Gcc-patches
The original documentation added to mention the clash between
-fsanitize=address and -fsanitize=hwaddress used confusing wording trying to
say that -fsanitize=hwaddress is only available on AArch64.

It read as if -fsanitize=address were only supported on AArch64.

This patch fixes that wording by being more explicit.

gcc/ChangeLog:

PR other/98437
* doc/invoke.texi (-fsanitize=address): Fix wording describing
clash with -fsanitize=hwaddress.



### Attachment also inlined for ease of reply###


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 
97aea535d900c0145340cc7af9141097ca1cc492..b378e63d6bce12d4dd9293e4c88039298daac9e8
 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14672,8 +14672,8 @@ the available options are shown at startup of the 
instrumented program.  See
 
@url{https://github.com/google/sanitizers/wiki/AddressSanitizerFlags#run-time-flags}
 for a list of supported options.
 The option cannot be combined with @option{-fsanitize=thread} or
-@option{-fsanitize=hwaddress}.  Note that the only target this option is
-currently supported on is AArch64.
+@option{-fsanitize=hwaddress}.  Note that the only target
+@option{-fsanitize=hwaddress} is currently supported on is AArch64.
 
 @item -fsanitize=kernel-address
 @opindex fsanitize=kernel-address

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 
97aea535d900c0145340cc7af9141097ca1cc492..b378e63d6bce12d4dd9293e4c88039298daac9e8
 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14672,8 +14672,8 @@ the available options are shown at startup of the 
instrumented program.  See
 
@url{https://github.com/google/sanitizers/wiki/AddressSanitizerFlags#run-time-flags}
 for a list of supported options.
 The option cannot be combined with @option{-fsanitize=thread} or
-@option{-fsanitize=hwaddress}.  Note that the only target this option is
-currently supported on is AArch64.
+@option{-fsanitize=hwaddress}.  Note that the only target
+@option{-fsanitize=hwaddress} is currently supported on is AArch64.
 
 @item -fsanitize=kernel-address
 @opindex fsanitize=kernel-address



[wwwdocs] Document new Hardware-assisted AddressSanitizer

2020-12-14 Thread Matthew Malcomson via Gcc-patches
gcc-11/changes: Document new Hardware-assisted AddressSanitizer.

I have put it in the "general" section rather than AArch64 since the feature
does add a general framework, so I believe the news is interesting for people
interesting in architectures other than AArch64 that may want to implement this
for their own targets (where possible).


### Attachment also inlined for ease of reply###


diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 
46a6a37225526227fb6a46d9d3248d6ba9f36a07..0127a3f99f2ca897bf1c32e967d359fb9ace6f6b
 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -116,6 +116,24 @@ a work-in-progress.
   option (which defaults to 8 spaces per tab stop).
 
   
+  
+
+Introduce http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html;>
+  Hardware-assisted AddressSanitizer support.  This sanitizer 
currently only works
+for the AArch64 target.  It helps debug address problems similarly to
+https://github.com/google/sanitizers/wiki/AddressSanitizer;>
+  AddressSanitizer but is based on partial hardware assistance and
+provides probabilistic protection to use less RAM at runtime.
+http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html;>
+  Hardware-assisted AddressSanitizer is not production-ready for 
userspace, and is
+provided mainly for use compiling the Linux Kernel.
+
+To use this sanitizer the command line arguments are:
+
+  -fsanitize=hwaddress to instrument userspace code.
+  -fsanitize=kernel-hwaddress to instrument kernel 
code.
+
+  
 
 
 

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 
46a6a37225526227fb6a46d9d3248d6ba9f36a07..0127a3f99f2ca897bf1c32e967d359fb9ace6f6b
 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -116,6 +116,24 @@ a work-in-progress.
   option (which defaults to 8 spaces per tab stop).
 
   
+  
+
+Introduce http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html;>
+  Hardware-assisted AddressSanitizer support.  This sanitizer 
currently only works
+for the AArch64 target.  It helps debug address problems similarly to
+https://github.com/google/sanitizers/wiki/AddressSanitizer;>
+  AddressSanitizer but is based on partial hardware assistance and
+provides probabilistic protection to use less RAM at runtime.
+http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html;>
+  Hardware-assisted AddressSanitizer is not production-ready for 
userspace, and is
+provided mainly for use compiling the Linux Kernel.
+
+To use this sanitizer the command line arguments are:
+
+  -fsanitize=hwaddress to instrument userspace code.
+  -fsanitize=kernel-hwaddress to instrument kernel 
code.
+
+  
 
 
 



[PATCH] testsuite: Correct check_effective_target_hwaddress_exec

2020-11-27 Thread Matthew Malcomson via Gcc-patches
Hello,

-

This test should ensure that we can compile with hwasan, that such a compiled
binary runs as expected, *and* that we're running on a kernel which implements
the capability to ignore the top bytes of pointers in syscalls.

It was expected that a basic test of `int main(void) { return 0; }` would check
this, since there is a check called during `__hwasan_init` in libhwasan to
ensure that the kernel we're running on provides a `prctl` to request the
relaxed ABI.

Unfortunately, the check in libhwasan has a bug in it, and does not correctly
fail when the kernel feature is not around.  This means that check is not
automatically provided by the runtime.

The sanitizer runtime will be fixed but would like to install this fix here in
case fixing the library is not quick enough for the release (and so that people
running the testsuite do not see spurious errors in the meantime).

Tested by running testsuite on an AArch64 machine with and without the required
kernel.
Observed that the test does indeed fail when the kernel feature is unavailable
and pass when the feature is available.

Note that this test is directly targetting AArch64 by using `prctl` numbers
specific to it.  That's unfortunate, but once the runtime fix has gone in we
will be able to remove that requirement.

Ok for trunk?

gcc/testsuite/ChangeLog:

* lib/hwasan-dg.exp (check_effective_target_hwaddress_exec): Fix
check for correct kernel version.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/lib/hwasan-dg.exp b/gcc/testsuite/lib/hwasan-dg.exp
index 
892f2bab43325e830b5d9243377c70e074cdfe40..bd2a011947f9d3384ee32ffa9996a49429256af2
 100644
--- a/gcc/testsuite/lib/hwasan-dg.exp
+++ b/gcc/testsuite/lib/hwasan-dg.exp
@@ -41,7 +41,24 @@ proc check_effective_target_fsanitize_hwaddress {} {
 
 proc check_effective_target_hwaddress_exec {} {
 if ![check_runtime hwaddress_exec {
-   int main (void) { return 0; }
+   #ifdef __cplusplus
+   extern "C" {
+   #endif
+   extern int prctl(int, unsigned long, unsigned long, unsigned long, 
unsigned long);
+   #ifdef __cplusplus
+   }
+   #endif
+   int main (void) {
+   #define PR_SET_TAGGED_ADDR_CTRL 55
+   #define PR_GET_TAGGED_ADDR_CTRL 56
+   #define PR_TAGGED_ADDR_ENABLE (1UL << 0)
+ if (prctl (PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0) == -1)
+   return 1;
+ if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 
-1
+ || !prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0))
+   return 1;
+ return 0;
+   }
 }] {
return 0;
 }

diff --git a/gcc/testsuite/lib/hwasan-dg.exp b/gcc/testsuite/lib/hwasan-dg.exp
index 
892f2bab43325e830b5d9243377c70e074cdfe40..bd2a011947f9d3384ee32ffa9996a49429256af2
 100644
--- a/gcc/testsuite/lib/hwasan-dg.exp
+++ b/gcc/testsuite/lib/hwasan-dg.exp
@@ -41,7 +41,24 @@ proc check_effective_target_fsanitize_hwaddress {} {
 
 proc check_effective_target_hwaddress_exec {} {
 if ![check_runtime hwaddress_exec {
-   int main (void) { return 0; }
+   #ifdef __cplusplus
+   extern "C" {
+   #endif
+   extern int prctl(int, unsigned long, unsigned long, unsigned long, 
unsigned long);
+   #ifdef __cplusplus
+   }
+   #endif
+   int main (void) {
+   #define PR_SET_TAGGED_ADDR_CTRL 55
+   #define PR_GET_TAGGED_ADDR_CTRL 56
+   #define PR_TAGGED_ADDR_ENABLE (1UL << 0)
+ if (prctl (PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0) == -1)
+   return 1;
+ if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 
-1
+ || !prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0))
+   return 1;
+ return 0;
+   }
 }] {
return 0;
 }



Re: Update: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-11-24 Thread Matthew Malcomson via Gcc-patches

On 24/11/2020 12:30, Hongtao Liu wrote:

Hi:
   I'm learning about this patch, and I see one place that might be
slighted improved.

+  poly_int64 size = (top - bot);
+
+  /* Assert the edge of each variable is aligned to the HWASAN tag granule
+size.  */
+  gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
+  gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
+  gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
+

The last gcc_assert looks redundant?


Hi

I think you're right.

Just FYI I'm planning on making that change as an obvious fix after the 
patchset as it is now goes in.


That way I can say I ran all my tests on the patch series that I applied 
without going through the all the tests again.


Thanks for the catch!

Matthew


On Sat, Nov 21, 2020 at 2:48 AM Matthew Malcomson via Gcc-patches
 wrote:





Re: libsanitizer: Hwasan reporting check for dladdr failing

2020-11-24 Thread Matthew Malcomson via Gcc-patches

On 24/11/2020 15:53, Kyrylo Tkachov wrote:

Hi Matthew,


-Original Message-
From: Gcc-patches  On Behalf Of
Matthew Malcomson via Gcc-patches
Sent: 24 November 2020 15:47
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford 
Subject: libsanitizer: Hwasan reporting check for dladdr failing

Hello there,

This is the compiler-rt patch I'd like to cherry-pick so that the hwasan tests
pass.

It is in LLVM as commit 83ac18205ec69a00ac2be3b603bc3a61293fbe89.

Ok for trunk?

Also is the libhwasan library merge from the below email OK for trunk?
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558999.html
(Note I would also add a line to README.gcc mentioning compiler-
rt/lib/hwasan
on top of the changes in that patch).

I would guess so, but wasn't certain the OK had ever been said anywhere.


I believe merges from an upstream are generally considered pre-approved. In any 
case, I see that merge committed as 98f792ff538109c71d85ab2a61461cd090f3b9f3

Thanks,
Kyrill


Thanks Kyrill,

For the record, 98f792ff538109c71d85ab2a61461cd090f3b9f3 is the merge 
without libhwasan part, while I was asking about the libhwasan part.


However, given the standard pre-approved merge from upstream status I 
guess it doesn't matter -- and that's good to know for the future too.


Thanks!
Matthew





Regards,
Matthew

-


In `GetGlobalSizeFromDescriptor` we use `dladdr` to get info on the the
current address.  `dladdr` returns 0 if it failed.
During testing on Linux this returned 0 to indicate failure, and
populated the `info` structure with a NULL pointer which was
dereferenced later.

This patch checks for `dladdr` returning 0, and in that case returns 0
from `GetGlobalSizeFromDescriptor` to indicate failure of identifying
the address.

This occurs when `GetModuleNameAndOffsetForPC` succeeds for some
address
not in a dynamically loaded library.  One example is when the found
"module" is '[stack]' having come from parsing /proc/self/maps.

Cherry-pick from 83ac18205ec69a00ac2be3b603bc3a61293fbe89.

Differential Revision: https://reviews.llvm.org/D91344


### Attachment also inlined for ease of reply
###


diff --git a/libsanitizer/hwasan/hwasan_report.cpp
b/libsanitizer/hwasan/hwasan_report.cpp
index
0be7deeaee1a0bd523d9e0fe1dc3b1311b3920e2..894a149775f291bae9cad8
33b1ac54914212f405 100644
--- a/libsanitizer/hwasan/hwasan_report.cpp
+++ b/libsanitizer/hwasan/hwasan_report.cpp
@@ -254,7 +254,8 @@ static bool TagsEqual(tag_t tag, tag_t *tag_ptr) {
  static uptr GetGlobalSizeFromDescriptor(uptr ptr) {
// Find the ELF object that this global resides in.
Dl_info info;
-  dladdr(reinterpret_cast(ptr), );
+  if (dladdr(reinterpret_cast(ptr), ) == 0)
+return 0;
auto *ehdr = reinterpret_cast(info.dli_fbase);
auto *phdr_begin = reinterpret_cast(
reinterpret_cast(ehdr) + ehdr->e_phoff);






libsanitizer: Hwasan reporting check for dladdr failing

2020-11-24 Thread Matthew Malcomson via Gcc-patches
Hello there,

This is the compiler-rt patch I'd like to cherry-pick so that the hwasan tests
pass.

It is in LLVM as commit 83ac18205ec69a00ac2be3b603bc3a61293fbe89.

Ok for trunk?

Also is the libhwasan library merge from the below email OK for trunk?
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558999.html
(Note I would also add a line to README.gcc mentioning compiler-rt/lib/hwasan
on top of the changes in that patch).

I would guess so, but wasn't certain the OK had ever been said anywhere.

Regards,
Matthew

-


In `GetGlobalSizeFromDescriptor` we use `dladdr` to get info on the the
current address.  `dladdr` returns 0 if it failed.
During testing on Linux this returned 0 to indicate failure, and
populated the `info` structure with a NULL pointer which was
dereferenced later.

This patch checks for `dladdr` returning 0, and in that case returns 0
from `GetGlobalSizeFromDescriptor` to indicate failure of identifying
the address.

This occurs when `GetModuleNameAndOffsetForPC` succeeds for some address
not in a dynamically loaded library.  One example is when the found
"module" is '[stack]' having come from parsing /proc/self/maps.

Cherry-pick from 83ac18205ec69a00ac2be3b603bc3a61293fbe89.

Differential Revision: https://reviews.llvm.org/D91344


### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/hwasan/hwasan_report.cpp 
b/libsanitizer/hwasan/hwasan_report.cpp
index 
0be7deeaee1a0bd523d9e0fe1dc3b1311b3920e2..894a149775f291bae9cad833b1ac54914212f405
 100644
--- a/libsanitizer/hwasan/hwasan_report.cpp
+++ b/libsanitizer/hwasan/hwasan_report.cpp
@@ -254,7 +254,8 @@ static bool TagsEqual(tag_t tag, tag_t *tag_ptr) {
 static uptr GetGlobalSizeFromDescriptor(uptr ptr) {
   // Find the ELF object that this global resides in.
   Dl_info info;
-  dladdr(reinterpret_cast(ptr), );
+  if (dladdr(reinterpret_cast(ptr), ) == 0)
+return 0;
   auto *ehdr = reinterpret_cast(info.dli_fbase);
   auto *phdr_begin = reinterpret_cast(
   reinterpret_cast(ehdr) + ehdr->e_phoff);

diff --git a/libsanitizer/hwasan/hwasan_report.cpp 
b/libsanitizer/hwasan/hwasan_report.cpp
index 
0be7deeaee1a0bd523d9e0fe1dc3b1311b3920e2..894a149775f291bae9cad833b1ac54914212f405
 100644
--- a/libsanitizer/hwasan/hwasan_report.cpp
+++ b/libsanitizer/hwasan/hwasan_report.cpp
@@ -254,7 +254,8 @@ static bool TagsEqual(tag_t tag, tag_t *tag_ptr) {
 static uptr GetGlobalSizeFromDescriptor(uptr ptr) {
   // Find the ELF object that this global resides in.
   Dl_info info;
-  dladdr(reinterpret_cast(ptr), );
+  if (dladdr(reinterpret_cast(ptr), ) == 0)
+return 0;
   auto *ehdr = reinterpret_cast(info.dli_fbase);
   auto *phdr_begin = reinterpret_cast(
   reinterpret_cast(ehdr) + ehdr->e_phoff);



Re: [PATCH 7/X] libsanitizer: Add tests

2020-11-23 Thread Matthew Malcomson via Gcc-patches

Hello,

Update of the hwasan tests attached.

Ok for trunk?

(NOTE on the state of the entire patch series:
  Currently re-testing everything from scratch to ensure I didn't get tripped up
  from old state carried around.
  Also looking into PR 97941 on hwasan.  As yet don't understand what's going
  on there and having difficulty reproducing.)




Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/sanitize-recover-7.c: Update error message format.
* lib/asan-dg.exp (check_effective_target_fsanitize_address): Move to
target-supports.exp.
(asan_link_flags): Implement as a helper function asan_link_flags_1
which asan_link_flags and hwasan_link_flags use.
(asan_link_flags_1): Parametrised version of asan_link_flags.
* lib/target-supports.exp (check_effective_target_fsanitize_address):
Move from asan-dg.exp.
(check_effective_target_fsanitize_hwaddress,
check_effective_target_hwaddress_exec): New.
* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-base-init.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/handles-poly_int-marked-vars.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-2.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-3.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-4.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-5.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-6.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-7.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-mem-intrinsics.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New test.
* g++.dg/hwasan/rvo-handled.C: New test.
* gcc.dg/hwasan/hwasan.exp: New test.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* 

Re: [PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing

2020-11-20 Thread Matthew Malcomson via Gcc-patches
Hi there,

I was just doing some double-checks and noticed I'd placed the
documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
documented in the `Register Classes` section, so I've now moved it to
the `Misc` section.

That's the only change, Ok for trunk?

Matthew



These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling targeting the
kernel.  This flag has defaults to match the LLVM implementation and
sets some other behaviors to work in the kernel (e.g. accounting for
the fact that the stack pointer will have 0xff in the top byte and to not
call the userspace library initialisation routines).
The defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.

Since we are introducing a few more conflicts between sanitizer flags we
refactor the checking for such conflicts to use a helper function which
makes checking for such conflicts more easy and consistent.

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/invoke.texi: Document hwasan command line flags.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(find_sanitizer_argument): New.
(report_conflicting_sanitizer_options): New.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan--instrument-stack): New
(hwasan-random-frame-tag): New
(hwasan-instrument-allocas): New
(hwasan-instrument-reads): New
(hwasan-instrument-writes): New
(hwasan-instrument-mem-intrinsics): New
* target.def (HOOK_PREFIX): Add new hook.
(can_tag_addresses): Add new hook under memtag prefix.
* targhooks.c (default_memtag_can_tag_addresses): New.
* targhooks.h (default_memtag_can_tag_addresses): New decl.
* toplev.c (process_options): Ensure hwasan only on
architectures that advertise the possibility.



### Attachment also inlined for ease of reply###


diff --git a/gcc/common.opt b/gcc/common.opt
index 
fe39b3dee9f270dd39b3f69ff6a0e2e854058703..4e4ba790ce668e490e35c2d95a0b12472754fba4
 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -218,7 +218,7 @@ unsigned int flag_sanitize
 
 ; What sanitizers should recover from errors
 Variable
-unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | 
SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS) & 
~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
+unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | 
SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | 
SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
 
 ; What the coverage sanitizers should instrument
 Variable
@@ -3445,6 +3445,9 @@ Driver
 static-libasan
 Driver
 
+static-libhwasan
+Driver
+
 static-libtsan
 Driver
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
6de51b521bacb0530799c7cbddb5f6b170bf441c..4f90a49f1b79db406319ddbf89e42f58a695f430
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23294,6 +23294,15 @@ aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, 
const_tree type1,
   return NULL;
 }
 
+/* Implement TARGET_MEMTAG_CAN_TAG_ADDRESSES.  Here we tell the rest of the
+   compiler that we automatically ignore the top byte of our pointers, which
+   allows using -fsanitize=hwaddress.  */
+bool
+aarch64_can_tag_addresses ()
+{
+  return !TARGET_ILP32;
+}
+
 /* Implement TARGET_ASM_FILE_END for AArch64.  This adds the AArch64 GNU NOTE
section at the end if needed.  */
 #define 

Re: Update: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-11-20 Thread Matthew Malcomson via Gcc-patches


Hi there,

I was just doing some double-checks and noticed I'd placed the
documentation in the wrong section of tm.texi.  The `MEMTAG` hooks were
documented in the `Register Classes` section, so I've now moved it to
the `Misc` section.

That's the only change, Ok for trunk?

Matthew






Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
object is aligned to ensure the start of any other data stored on the
stack is in a different granule.

This patch ensures the above by forcing the stack pointer to be aligned
before and after allocating any stack objects. Since we are forcing
alignment we also use `align_local_variable` to ensure this new alignment
is advertised properly through SET_DECL_ALIGN.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

Backend hooks define the size of a tag, the layout of the HWASAN shadow
memory, and handle emitting the code that inserts and extracts tags from a
pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable. This stack region is tagged to match the tag added to
each pointer to that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tags.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog:

* bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
during bootstrap.

ChangeLog:

* gcc/asan.c (struct hwasan_stack_var): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_allocas_p): New.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_frame_tag): New.
(hwasan_frame_base): New.
(stack_vars_base_reg_p): New.
(hwasan_maybe_init_frame_base_init): New.
(hwasan_record_stack_var): New.
(hwasan_get_frame_extent): New.
(hwasan_increment_frame_tag): New.
(hwasan_record_frame_init): New.
(hwasan_emit_prologue): New.
(hwasan_emit_untag_frame): New.
(hwasan_finish_file): New.
(hwasan_truncate_to_tag_size): New.
* gcc/asan.h (hwasan_record_frame_init): New declaration.
(hwasan_record_stack_var): New declaration.
(hwasan_emit_prologue): New declaration.
(hwasan_emit_untag_frame): New declaration.
(hwasan_get_frame_extent): New declaration.
(hwasan_maybe_enit_frame_base_init): New declaration.
(hwasan_frame_base): New declaration.
(stack_vars_base_reg_p): New declaration.
(hwasan_current_frame_tag): New declaration.
(hwasan_increment_frame_tag): New declaration.
(hwasan_truncate_to_tag_size): New declaration.
(hwasan_finish_file): New declaration.
(hwasan_sanitize_p): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_sanitize_allocas_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE): New macro.
(HWASAN_STACK_BACKGROUND): New macro.
* gcc/builtin-types.def 

Re: [Patch 0/X] HWASAN v4

2020-11-20 Thread Matthew Malcomson via Gcc-patches

On 13/11/2020 17:22, Martin Liška wrote:

On 11/13/20 5:57 PM, Matthew Malcomson wrote:

Hi there,

Thanks for the heads-up.
As it turns out the most recent `libhwasan` crashes when displaying an 
address on the stack in Linux.


Hello.

What a bad luck.



I'm currently working on getting it fixed here 
https://reviews.llvm.org/D91344#2393371 
<https://reviews.llvm.org/D91344#2393371> .
If this hwasan patch series gets approved and if that patch goes in 
would it be feasible to bump the libsanitizer merge to whatever 
version that would be?


If not (maybe because stage1 would be finished?) then could/would we 
end up using the LOCAL_PATCHES approach?


Since now, I would prefer doing cherry picks. Hopefully, we'll end just 
with couple of patches.


That makes sense, there's just one patch I need 83ac1820.

As far as I can tell from the history, the process is simply to apply 
the patch in GCC, commit it with a ChangeLog etc as if it were a GCC 
patch, and then add the hash into LOCAL_PATCHES as a separate commit.


Is that right?


Given that it looks like the hwasan patch series is nearing going in 
now, I'd like to make sure I know how the library is getting added.


Is the plan something like the below?
1) You add the libhwasan update (i.e. this patch you posted).
2) I add the cherry-pick from LLVM compiler-rt (once it's approved)
and a separate commit updating LOCAL_PATCHES.
3) I add the hwasan patch series.

Or would it make more sense for me to apply your patch below with 
`--author` using your details (so it goes in the ChangeLog that way)?


Thanks!
Matthew




Thanks,
Martin



Thanks,
Matthew
-- 


*From:* Martin Liška 
*Sent:* 13 November 2020 16:33
*To:* Matthew Malcomson ; 
gcc-patches@gcc.gnu.org 
*Cc:* ja...@redhat.com ; Richard Earnshaw 
; k...@google.com ; 
do...@redhat.com ; jos...@codesourcery.com 


*Subject:* Re: [Patch 0/X] HWASAN v4
On 10/16/20 11:03 AM, Martin Li�ka wrote:

Hello.

I've just merged libsanitizer and there's the corresponding part that 
includes

libhwasan.

Martin


Hey.

I've just made last merge from upstream, there's corresponding hwasan 
part.


Martin






Re: Document --with-build-config=bootstrap-asan option.

2020-11-20 Thread Matthew Malcomson via Gcc-patches

On 13/01/2020 10:40, Matthew Malcomson wrote:

On 11/01/2020 07:19, Gerald Pfeifer wrote:

On Thu, 12 Dec 2019, Matthew Malcomson wrote:

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* doc/install.texi: Document bootstrap-asan configuration option.


I see this introduces a new table.


+Some examples of build configurations designed for developers of GCC are:


@samp{bootstrap-time}, @samp{bootstrap-debug-ckovw} and others appear
to fall into the same camp, essentially expected to be used by maintainers
only.

Would it make sense to add your new option to the existing table, or
perhaps see which other options from the existing table to move into
your new one?  Thoughts?


Sounds good me.





The patch is okay modulo the question above.



Hi,

Apologies for bringing up something from this far back, but I've just 
noticed I never actually committed this patch.


Did the above line mean approval once the option was moved into the 
existing table?  Or was it just explaining that the position was the 
only problem you saw?


In other words: is the patch I proposed below Ok for trunk?

Thanks,
Matthew



Thanks,
Gerald



Patch with above suggestion.

#

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index
80b47812fe66a8ef50edf3aad9708ab3409ba7dc..0705759c69f64c6d06e91f7ae83bb8c1ad210f34
100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2668,6 +2668,10 @@ Arranges for the run time of each program started
by the GCC driver,
   built in any stage, to be logged to @file{time.log}, in the top level of
   the build tree.

+@item @samp{bootstrap-asan}
+Compiles GCC itself using Address Sanitization in order to catch
invalid memory
+accesses within the GCC code.
+
   @end table

   @section Building a cross compiler





Re: Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-20 Thread Matthew Malcomson via Gcc-patches

Updates after latest review.
(testing underway)

---
There are four main features to this change:

1) Check pointer tags match address tags.

When sanitizing for hwasan we now put HWASAN_CHECK internal functions before
memory accesses in the `asan` pass.  This checks that a tag in the pointer
being used match the tag stored in shadow memory for the memory region being
used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way, usually in the runtime).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(asan_instrument): Branch based on whether using HWASAN or ASAN.
(pass_asan::gate): Return true if sanitizing HWASAN.
(pass_asan_O0::gate): Return true if sanitizing HWASAN.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
* asan.h (hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-fold.c (gimple_build): New overload for building function
calls without arguments.
(gimple_build_round_up): New.
* gimple-fold.h (gimple_build): New decl.
(gimple_build): New inline function.
(gimple_build_round_up): New decl.
(gimple_build_round_up): New inline function.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_SET_TAG): New.
* internal-fn.def (HWASAN_ALLOCA_UNPOISON): New.
(HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_SET_TAG): New.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.

Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-19 Thread Matthew Malcomson via Gcc-patches

Update to match the change in initialisation for patch 5/X.

MM

---

There are four main features to this change:

1) Check pointer tags match address tags.

When sanitizing for hwasan we now put HWASAN_CHECK internal functions before
memory accesses in the `asan` pass.  This checks that a tag in the pointer
being used match the tag stored in shadow memory for the memory region being
used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way, usually in the runtime).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(hwasan_emit_round_up): New.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(asan_instrument): Branch based on whether using HWASAN or ASAN.
(pass_asan::gate): Return true if sanitizing HWASAN.
(pass_asan_O0::gate): Return true if sanitizing HWASAN.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
* asan.h (hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_SET_TAG): New.
* internal-fn.def (HWASAN_ALLOCA_UNPOISON): New.
(HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_SET_TAG): New.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.

Update: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-11-19 Thread Matthew Malcomson via Gcc-patches
Hi there,

After offline discussion with Richard I've modified the way in which the
initialisation for the hwasan base pointer is emitted.
Originally it was getting emitted during `expand_used_vars`, and
requiring `handle_builtin_alloca` to register a need for it to be
emitted so that `expand_HWASAN_CHOOSE_TAG` can use an initialised base
pointer.

Now we go through the entire expansion of the function body, and then if
`hwasan_frame_base_ptr` was used anywhere we emit the initialisation
just before `parm_birth_insn`.

This is the updated patch for the stack variable handling.
(Testing underway)

MM

---


Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
object is aligned to ensure the start of any other data stored on the
stack is in a different granule.

This patch ensures the above by forcing the stack pointer to be aligned
before and after allocating any stack objects. Since we are forcing
alignment we also use `align_local_variable` to ensure this new alignment
is advertised properly through SET_DECL_ALIGN.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

Backend hooks define the size of a tag, the layout of the HWASAN shadow
memory, and handle emitting the code that inserts and extracts tags from a
pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable. This stack region is tagged to match the tag added to
each pointer to that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tags.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog:

* bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
during bootstrap.

ChangeLog:

* gcc/asan.c (struct hwasan_stack_var): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_allocas_p): New.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_frame_tag): New.
(hwasan_frame_base): New.
(stack_vars_base_reg_p): New.
(hwasan_maybe_emit_frame_base_init): New.
(hwasan_record_stack_var): New.
(hwasan_get_frame_extent): New.
(hwasan_increment_frame_tag): New.
(hwasan_record_frame_init): New.
(hwasan_emit_prologue): New.
(hwasan_emit_untag_frame): New.
(hwasan_finish_file): New.
(hwasan_truncate_to_tag_size): New.
* gcc/asan.h (hwasan_record_frame_init): New declaration.
(hwasan_record_stack_var): New declaration.
(hwasan_emit_prologue): New declaration.
(hwasan_emit_untag_frame): New declaration.
(hwasan_get_frame_extent): New declaration.
(hwasan_maybe_emit_frame_base_init): New declaration.
(hwasan_frame_base): New declaration.
(stack_vars_base_reg_p): New declaration.
(hwasan_current_frame_tag): New declaration.
(hwasan_increment_frame_tag): New declaration.
(hwasan_truncate_to_tag_size): New declaration.
(hwasan_finish_file): New declaration.
(hwasan_sanitize_p): New declaration.

[PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-16 Thread Matthew Malcomson via Gcc-patches
There are four main features to this change:

1) Check pointer tags match address tags.

When sanitizing for hwasan we now put HWASAN_CHECK internal functions before
memory accesses in the `asan` pass.  This checks that a tag in the pointer
being used match the tag stored in shadow memory for the memory region being
used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way, usually in the runtime).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(hwasan_emit_round_up): New.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(asan_instrument): Branch based on whether using HWASAN or ASAN.
(pass_asan::gate): Return true if sanitizing HWASAN.
(pass_asan_O0::gate): Return true if sanitizing HWASAN.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
* asan.h (hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_SET_TAG): New.
* internal-fn.def (HWASAN_ALLOCA_UNPOISON): New.
(HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_SET_TAG): New.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.
(BUILT_IN_HWASAN_STORE8): New.
(BUILT_IN_HWASAN_STORE16): New.

[PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-11-16 Thread Matthew Malcomson via Gcc-patches
Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
object is aligned to ensure the start of any other data stored on the
stack is in a different granule.

This patch ensures the above by forcing the stack pointer to be aligned
before and after allocating any stack objects. Since we are forcing
alignment we also use `align_local_variable` to ensure this new alignment
is advertised properly through SET_DECL_ALIGN.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

Backend hooks define the size of a tag, the layout of the HWASAN shadow
memory, and handle emitting the code that inserts and extracts tags from a
pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable. This stack region is tagged to match the tag added to
each pointer to that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tags.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog:

* bootstrap-hwasan.mk: Disable random frame tags for stack-tagging
during bootstrap.

ChangeLog:

* gcc/asan.c (struct hwasan_stack_var): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_allocas_p): New.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_frame_tag): New.
(hwasan_frame_base): New.
(stack_vars_base_reg_p): New.
(hwasan_maybe_init_frame_base): New.
(hwasan_record_frame_base_needed): New.
(hwasan_record_stack_var): New.
(hwasan_get_frame_extent): New.
(hwasan_increment_frame_tag): New.
(hwasan_record_frame_init): New.
(hwasan_emit_prologue): New.
(hwasan_emit_untag_frame): New.
(hwasan_finish_file): New.
(hwasan_truncate_to_tag_size): New.
* gcc/asan.h (hwasan_record_frame_init): New declaration.
(hwasan_record_stack_var): New declaration.
(hwasan_emit_prologue): New declaration.
(hwasan_emit_untag_frame): New declaration.
(hwasan_get_frame_extent): New declaration.
(hwasan_maybe_init_frame_base): New declaration.
(hwasan_frame_base): New declaration.
(stack_vars_base_reg_p): New declaration.
(hwasan_record_frame_base_needed): New declaration.
(hwasan_current_frame_tag): New declaration.
(hwasan_increment_frame_tag): New declaration.
(hwasan_truncate_to_tag_size): New declaration.
(hwasan_finish_file): New declaration.
(hwasan_sanitize_p): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_sanitize_allocas_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE): New macro.
(HWASAN_STACK_BACKGROUND): New macro.
* gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New.
* gcc/builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN.
* gcc/cfgexpand.c (align_local_variable): When using hwasan ensure
alignment to tag granule.
(align_frame_offset): New.

[PATCH 1/X] libsanitizer: Tie the hwasan library into our build system

2020-11-16 Thread Matthew Malcomson via Gcc-patches
This patch tries to tie libhwasan into the GCC build system in the same way
that the other sanitizer runtime libraries are handled.

libsanitizer/ChangeLog:

* Makefile.am:  Build libhwasan.
* Makefile.in:  Build libhwasan.
* asan/Makefile.in:  Build libhwasan.
* configure:  Build libhwasan.
* configure.ac:  Build libhwasan.
* hwasan/Makefile.am: New file.
* hwasan/Makefile.in: New file.
* hwasan/libtool-version: New file.
* interception/Makefile.in: Build libhwasan.
* libbacktrace/Makefile.in: Build libhwasan.
* libsanitizer.spec.in: Build libhwasan.
* lsan/Makefile.in: Build libhwasan.
* sanitizer_common/Makefile.in: Build libhwasan.
* tsan/Makefile.in: Build libhwasan.
* ubsan/Makefile.in: Build libhwasan.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,11 +14,12 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan
+SUBDIRS += lsan asan ubsan hwasan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
-  include/sanitizer/tsan_interface.h
+  include/sanitizer/tsan_interface.h \
+  include/sanitizer/hwasan_interface.h
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
02c7f70ac6578a3e93a490ce8bd2c54fc0693c50..2c57d49cbffdb486645aeb5f2c0f85d6e0fad124
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -92,7 +92,8 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@am__append_1 = 
include/sanitizer/common_interface_defs.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/lsan_interface.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/asan_interface.h \
-@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h \
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/hwasan_interface.h
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
@@ -207,7 +208,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan tsan
+   ubsan hwasan tsan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -329,6 +330,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
@@ -362,7 +364,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 
29622bf466a37f819c9fade30e31195adda51190..25c7fd7b7597d6e243005a1bb7de5b6243d2cfcf
 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -383,6 +383,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
04eca04fbe5e59bae1ba00597de0cf1b7cf1b5fa..27e72c089cb891dcce09494fa9e39eebe55d2598
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -657,6 +657,7 @@ USING_MAC_INTERPOSE_TRUE
 link_liblsan
 link_libubsan
 link_libtsan
+link_libhwasan
 link_libasan
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
@@ -12361,7 +12362,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12364 "configure"
+#line 12365 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12467,7 +12468,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12470 "configure"
+#line 12471 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15943,6 +15944,10 @@ fi
 link_libasan=$link_sanitizer_common
 
 
+# Set up the set of additional 

[PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing

2020-11-16 Thread Matthew Malcomson via Gcc-patches
These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling targeting the
kernel.  This flag has defaults to match the LLVM implementation and
sets some other behaviors to work in the kernel (e.g. accounting for
the fact that the stack pointer will have 0xff in the top byte and to not
call the userspace library initialisation routines).
The defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.

Since we are introducing a few more conflicts between sanitizer flags we
refactor the checking for such conflicts to use a helper function which
makes checking for such conflicts more easy and consistent.

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/invoke.texi: Document hwasan command line flags.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(find_argument): New.
(report_conflicting_sanitizer_options): New.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan--instrument-stack): New
(hwasan-random-frame-tag): New
(hwasan-instrument-allocas): New
(hwasan-instrument-reads): New
(hwasan-instrument-writes): New
(hwasan-instrument-mem-intrinsics): New
* target.def (HOOK_PREFIX): Add new hook.
(can_tag_addresses): Add new hook under memtag prefix.
* targhooks.c (default_memtag_can_tag_addresses): New.
* targhooks.h (default_memtag_can_tag_addresses): New decl.
* toplev.c (process_options): Ensure hwasan only on
architectures that advertise the possibility.



### Attachment also inlined for ease of reply###


diff --git a/gcc/common.opt b/gcc/common.opt
index 
d4cbb2f86a554fa2f87e52b2b6cbd1de27ddaa31..c7eec0fe683b817233ab4572024c16542a646cb4
 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -218,7 +218,7 @@ unsigned int flag_sanitize
 
 ; What sanitizers should recover from errors
 Variable
-unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | 
SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS) & 
~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
+unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | 
SANITIZE_UNDEFINED_NONDEFAULT | SANITIZE_KERNEL_ADDRESS | 
SANITIZE_KERNEL_HWADDRESS) & ~(SANITIZE_UNREACHABLE | SANITIZE_RETURN)
 
 ; What the coverage sanitizers should instrument
 Variable
@@ -3429,6 +3429,9 @@ Driver
 static-libasan
 Driver
 
+static-libhwasan
+Driver
+
 static-libtsan
 Driver
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
27f587be7e7483093f1fd381d8fa14a6c6581ccf..ea85aadb619eb2e9ee7328f035f81aa8578ee3ad
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23175,6 +23175,15 @@ aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, 
const_tree type1,
   return NULL;
 }
 
+/* Implement TARGET_MEMTAG_CAN_TAG_ADDRESSES.  Here we tell the rest of the
+   compiler that we automatically ignore the top byte of our pointers, which
+   allows using -fsanitize=hwaddress.  */
+bool
+aarch64_can_tag_addresses ()
+{
+  return true;
+}
+
 /* Implement TARGET_ASM_FILE_END for AArch64.  This adds the AArch64 GNU NOTE
section at the end if needed.  */
 #define GNU_PROPERTY_AARCH64_FEATURE_1_AND 0xc000
@@ -23994,6 +24003,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_FNTYPE_ABI
 #define TARGET_FNTYPE_ABI aarch64_fntype_abi
 
+#undef TARGET_MEMTAG_CAN_TAG_ADDRESSES
+#define TARGET_MEMTAG_CAN_TAG_ADDRESSES aarch64_can_tag_addresses
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define 

[PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN

2020-11-16 Thread Matthew Malcomson via Gcc-patches
This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

* bootstrap-hwasan.mk: New file.

gcc/ChangeLog:

* doc/install.texi: Document new option.

libiberty/ChangeLog:

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
a2ea1a329b69de06906315e54a49c694c9704522..b41a258c80ee9f289de534185eb364bcb5ca6ae5
 100755
--- a/configure
+++ b/configure
@@ -9305,7 +9305,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
44fa75f3a329ef68f6800c8e09d49a9373f731cf..944f30cfea84e9266b4322df7902b867882d4d8c
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2814,7 +2814,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
60ee0a9dba17bf8f00d7c5320468ba847f08f8a0..e0f75b3d55582b5d1b8718f94e39c7d50656a926
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2794,6 +2794,11 @@ the build tree.
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
 
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+systems that are running Linux kernel version 5.4 or later.
+
 @end table
 
 @section Building a cross compiler
diff --git a/libiberty/configure b/libiberty/configure
index 
ff93c9ee9a6fa9c6bb1938bcdfffb2d0ae8c9698..b6af9baf21204a323cad0e7b40a426c72988ba3b
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5264,6 +5264,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
4e2599c14a89bafcb8c7e523b9ce5b3d60b8c0f6..ad952963971a31968b5d109661b9cab0aa4b95fc
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
204b25f45ef2f22bb246641a2aa9f9d09719737b..8b20e1d1d87e2dda9f37763492ddf39a8022c48c
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@ $(CET_HOST_FLAGS)
 AM_LDFLAGS = @ac_lto_plugin_ldflags@
 AM_LIBTOOLFLAGS = --tag=disable-static
-override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS))
-override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS))
+override CFLAGS := $(filter-out -fsanitize=address 
-fsanitize=hwaddress,$(CFLAGS))
+override LDFLAGS := $(filter-out -fsanitize=address 
-fsanitize=hwaddress,$(LDFLAGS))
 
 libexecsub_LTLIBRARIES = liblto_plugin.la
 gcc_build_dir = @gcc_build_dir@
diff --git a/lto-plugin/Makefile.in 

[PATCH 7/X] libsanitizer: Add tests

2020-11-16 Thread Matthew Malcomson via Gcc-patches
Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/sanitize-recover-7.c:
* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-base-init.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/handles-poly_int-marked-vars.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-2.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-3.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-4.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-5.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-6.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-7.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-mem-intrinsics.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New test.
* g++.dg/hwasan/rvo-handled.C: New test.
* gcc.dg/hwasan/hwasan.exp: New test.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New test.


hwasan-v56.patch.gz
Description: application/gzip


[PATCH 2/X] libsanitizer: Only build libhwasan when targeting AArch64

2020-11-16 Thread Matthew Malcomson via Gcc-patches
Though the library has limited support for x86, we don't have any
support for generating code targeting x86 so there is no point building
for that target.

Ensure we build for AArch64 but not for AArch64 ilp32.

libsanitizer/ChangeLog:

* Makefile.am: Condition Build hwasan directory.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Set HWASAN_SUPPORTED based on target
architecture.
* configure.tgt: Likewise.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
2a7e8e1debe838719db0f0fad218b2543cc3111b..065a65e78d49f7689a01ecb64db1f07ca83aa987
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,7 +14,7 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan hwasan
+SUBDIRS += lsan asan ubsan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
@@ -23,6 +23,9 @@ nodist_saninclude_HEADERS += \
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
+if HWASAN_SUPPORTED
+SUBDIRS += hwasan
+endif
 endif
 
 ## May be used by toolexeclibdir.
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
2c57d49cbffdb486645aeb5f2c0f85d6e0fad124..3873ea4d7050f04a3f7bbd0dd3f2a71e9b65d287
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -97,6 +97,7 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
+@HWASAN_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_5 = hwasan
 subdir = .
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
@@ -208,7 +209,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan hwasan tsan
+   ubsan tsan hwasan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -364,7 +365,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ $(am__append_4) $(am__append_5)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
27e72c089cb891dcce09494fa9e39eebe55d2598..720d4e17044170e4b91c42fede685761d98c1965
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -659,6 +659,8 @@ link_libubsan
 link_libtsan
 link_libhwasan
 link_libasan
+HWASAN_SUPPORTED_FALSE
+HWASAN_SUPPORTED_TRUE
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
 TSAN_SUPPORTED_FALSE
@@ -12362,7 +12364,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12365 "configure"
+#line 12367 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12468,7 +12470,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12471 "configure"
+#line 12473 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15819,6 +15821,7 @@ fi
 # Get target configury.
 unset TSAN_SUPPORTED
 unset LSAN_SUPPORTED
+unset HWASAN_SUPPORTED
 . ${srcdir}/configure.tgt
  if test "x$TSAN_SUPPORTED" = "xyes"; then
   TSAN_SUPPORTED_TRUE=
@@ -15836,6 +15839,14 @@ else
   LSAN_SUPPORTED_FALSE=
 fi
 
+ if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  HWASAN_SUPPORTED_TRUE=
+  HWASAN_SUPPORTED_FALSE='#'
+else
+  HWASAN_SUPPORTED_TRUE='#'
+  HWASAN_SUPPORTED_FALSE=
+fi
+
 
 # Check for functions needed.
 for ac_func in clock_getres clock_gettime clock_settime lstat readlink
@@ -16818,7 +16829,7 @@ ac_config_files="$ac_config_files Makefile 
libsanitizer.spec libbacktrace/backtr
 ac_config_headers="$ac_config_headers config.h"
 
 
-ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
hwasan/Makefile ubsan/Makefile"
+ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
ubsan/Makefile"
 
 
 if test "x$TSAN_SUPPORTED" = "xyes"; then
@@ -16826,6 +16837,11 @@ if test "x$TSAN_SUPPORTED" = "xyes"; then
 
 fi
 
+if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  ac_config_files="$ac_config_files hwasan/Makefile"
+
+fi
+
 
 
 
@@ -17090,6 +17106,10 @@ if test -z "${LSAN_SUPPORTED_TRUE}" && test -z 
"${LSAN_SUPPORTED_FALSE}"; then
   as_fn_error $? 

Hwasan v5

2020-11-16 Thread Matthew Malcomson via Gcc-patches
Hello,

Bootstrapped and regression tested on AArch64
(have not finished running tests on the Linux Kernel yet).
Sending upstream to catch next set of comments early.

This version mainly just consists of requested changes.

Some notable exceptions:

Have rebased onto a newer version, there was a change to `align_local_variable`
that meant some merge conflict resolutions were needed (though no functional
change expected).

Alignment of stack after object:
  In Patch 5/X (for tagging objects on the stack) I realised that in some code
  paths the current method was not ensuring alignment of the stack *after* the
  object in memory for !FRAME_GROWS_DOWNWARDS.
  There is a comment in `expand_stack_vars`, alongside the update.

  Originally we used `align_frame_offset` before space for the object gets
  allocated to ensure the stack is aligned after the object, and
  `alloc_stack_frame_space` with relevant alignment to ensure the stack is
  aligned before the object.

  When !FRAME_GROWS_DOWNWARDS, using `align_frame_offset` before the object
  space is allocated does not ensure alignment of the next object.  Rather
  we must use `align_frame_offset` after space is allocated.

  I now use `align_frame_offset` before *and* after `alloc_stack_frame_space`.
  This looks nicer than putting if conditions based on FRAME_GROWS_DOWNWARDS
  and does not end up allocating any extra space than only calling the align
  function once (it just means the alignment from the `alloc_stack_frame_space`
  is superfluous).

  I still ensure `align_local_variable` returns the correct alignment, even
  though it makes no functional difference.  This seems less likely to cause
  confusion to me.  Similarly, this function still uses `SET_DECL_ALIGN` to
  correctly advertise the alignment that the stack variable is aligned to.
  (N.b. testing does show that leaving `align_local_variable` as it is doesn't
  break anything, which is as expected since the alignment is ensured by the
  calls to `align_frame_offset`).


Ensure Alloca causes hwasan frame base to get initialised.
  In the last version of this patch `expand_HWASAN_CHOOSE_TAG` would use the
  hwasan frame base using `hwasan_base`.  Since this function is used after
  `expand_stack_vars` and hence after `hwasan_emit_prologue` that would not
  cause the initialisation of the frame base to be emitted, but would just use
  the pseudo register without any initialisation.

  This patch records that initialisation should be emitted when creating the
  `HWASAN_CHOOSE_TAG` internal function -- so that even if there are no
  statically allocated local variables the hwasan frame base will be
  initialisied.

  New test was introduced to catch this.


Introduce HWASAN_SET_TAG internal function.
  This was something that popped up during testing the linux kernel with
  stack-tagging.  The __hwasan_tag_pointer interface exported by libhwasan is
  not available in the kernel.  Since the main target of putting this feature
  in GCC is for the kernel we deal with this by emitting code inline to set a
  tag in a pointer.


New test to ensure HWASAN_MARK can handle poly_int sizes.
  The code is designed to handle any variable sized object, but the only sizes
  that could actually be used happen to be poly_int sized variables.

  This is essentially the testcase from pr97696.  HWASAN can handle such size
  types (though ASAN can't) so this should pass for HWASAN.

Thanks,
Matthew
  
Entire patch series attached to cover letter.

all-patches.tar.gz
Description: application/gzip


Re: [Patch 0/X] HWASAN v4

2020-11-13 Thread Matthew Malcomson via Gcc-patches
Hi there,

Thanks for the heads-up.
As it turns out the most recent `libhwasan` crashes when displaying an address 
on the stack in Linux.

I'm currently working on getting it fixed here 
https://reviews.llvm.org/D91344#2393371 .
If this hwasan patch series gets approved and if that patch goes in would it be 
feasible to bump the libsanitizer merge to whatever version that would be?

If not (maybe because stage1 would be finished?) then could/would we end up 
using the LOCAL_PATCHES approach?

Thanks,
Matthew

From: Martin Liška 
Sent: 13 November 2020 16:33
To: Matthew Malcomson ; gcc-patches@gcc.gnu.org 

Cc: ja...@redhat.com ; Richard Earnshaw 
; k...@google.com ; do...@redhat.com 
; jos...@codesourcery.com 
Subject: Re: [Patch 0/X] HWASAN v4

On 10/16/20 11:03 AM, Martin Li�ka wrote:
> Hello.
>
> I've just merged libsanitizer and there's the corresponding part that includes
> libhwasan.
>
> Martin

Hey.

I've just made last merge from upstream, there's corresponding hwasan part.

Martin


[Patch] opts: Change `is incompatible with` messages to have standard parametrised form

2020-11-09 Thread Matthew Malcomson via Gcc-patches
Hello,

In a recent review for one of the hwasan patches Richard S. noticed there are
quite a few errors of the form "% is incompatible with
".
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556137.html

In order to avoid this creating extra work for translators we would like to
change these error messages to use the form "%qs is incompatible with %qs" and
pass the flag as format arguments.

This patch implements that change.
There is only one change in the output the compiler produces from this patch,
an error message of "-fsanitize=address and -fsanitize=kernel-address are
incompatible with -fsanitize=thread" has been changed to "-fsanitize=thread is
incompatible with -fsanitize=address|kernel-address".
This matches the similar error messages for live patching which use the
messages "-f is incompatible with
-flive-patching=inline-only-static|inline-clone".

Bootstrapped and regtested on AArch64 without any problems.
Ok for trunk?

gcc/ChangeLog:

* opts.c (control_options_for_live_patching): Reform 'is incompatible
with' error messages to use a standard message with differing format
arguments.
(finish_options): Likewise.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/sanitize-recover-7.c: Update testcase.



### Attachment also inlined for ease of reply###


diff --git a/gcc/opts.c b/gcc/opts.c
index 
96291e89a49dd1cf25a0cacc5a62413d120fa24d..ac9972d9c386247af3482e07a94c76da3e1abb4d
 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -688,30 +688,26 @@ control_options_for_live_patching (struct gcc_options 
*opts,
 {
 case LIVE_PATCHING_INLINE_ONLY_STATIC:
   if (opts_set->x_flag_ipa_cp_clone && opts->x_flag_ipa_cp_clone)
-   error_at (loc,
- "%<-fipa-cp-clone%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-cp-clone", "-flive-patching=inline-only-static");
   else
opts->x_flag_ipa_cp_clone = 0;
 
   if (opts_set->x_flag_ipa_sra && opts->x_flag_ipa_sra)
-   error_at (loc,
- "%<-fipa-sra%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-sra", "-flive-patching=inline-only-static");
   else
opts->x_flag_ipa_sra = 0;
 
   if (opts_set->x_flag_partial_inlining && opts->x_flag_partial_inlining)
-   error_at (loc,
- "%<-fpartial-inlining%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fpartial-inlining", "-flive-patching=inline-only-static");
   else
opts->x_flag_partial_inlining = 0;
 
   if (opts_set->x_flag_ipa_cp && opts->x_flag_ipa_cp)
-   error_at (loc,
- "%<-fipa-cp%> is incompatible with "
- "%<-flive-patching=inline-only-static%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-cp", "-flive-patching=inline-only-static");
   else
opts->x_flag_ipa_cp = 0;
 
@@ -719,9 +715,9 @@ control_options_for_live_patching (struct gcc_options *opts,
 case LIVE_PATCHING_INLINE_CLONE:
   /* live patching should disable whole-program optimization.  */
   if (opts_set->x_flag_whole_program && opts->x_flag_whole_program)
-   error_at (loc,
- "%<-fwhole-program%> is incompatible with "
- "%<-flive-patching=inline-only-static|inline-clone%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fwhole-program",
+ "-flive-patching=inline-only-static|inline-clone");
   else
opts->x_flag_whole_program = 0;
 
@@ -730,65 +726,65 @@ control_options_for_live_patching (struct gcc_options 
*opts,
 && !flag_partial_inlining.  */
 
   if (opts_set->x_flag_ipa_pta && opts->x_flag_ipa_pta)
-   error_at (loc,
- "%<-fipa-pta%> is incompatible with "
- "%<-flive-patching=inline-only-static|inline-clone%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-pta",
+ "-flive-patching=inline-only-static|inline-clone");
   else
opts->x_flag_ipa_pta = 0;
 
   if (opts_set->x_flag_ipa_reference && opts->x_flag_ipa_reference)
-   error_at (loc,
- "%<-fipa-reference%> is incompatible with "
- "%<-flive-patching=inline-only-static|inline-clone%>");
+   error_at (loc, "%qs is incompatible with %qs",
+ "-fipa-reference",
+ "-flive-patching=inline-only-static|inline-clone");
   else
opts->x_flag_ipa_reference = 0;
 
   if (opts_set->x_flag_ipa_ra && opts->x_flag_ipa_ra)
-   error_at (loc,
- "%<-fipa-ra%> is incompatible with "
-

[PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-11-03 Thread Matthew Malcomson via Gcc-patches
Hi Richard,

I'm sending up the revised patch 5 (introducing stack variable handling)
without the other changes to other patches.

I figure there's been quite a lot of changes to this patch and I wanted
to give you time to review them while I worked on finishing the less
widespread changes in patch 6 and before I ran the more exhaustive (and
time-consuming) tests in case you didn't like the changes and those
exhaustive tests would just have to get repeated.

The big differences between this and the last version are:

- I moved all helper functions which rely on how the tag is encoded into
  hooks.  This allows backends to choose a different ABI for hwasan
  tagging.  That said, any ABI which doesn't ensure the entire the tag
  is stored as a top byte is unsupported as the library doesn't handle
  anything else.

- I no longer delay emitting RTL to initialise the hwasan_base_pointer.
  It's now emitted as soon as the pointer is used anywhere.

- No longer delay allocating stack variables for hwasan to work.
  Instead we record stack variables when allocating through
  expand_stack_var_1 *and* when allocating through expand_stack_vars.

- Use `frame_offset` in more places to avoid having to manually handle
  the case of `!FRAME_GROWS_DOWNWARDS`.





Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
object is aligned to ensure the start of any other data stored on the
stack is in a different granule.

This patch ensures the above by adding alignment requirements in
`align_local_variable` and forcing the stack pointer to be aligned before
allocating any stack objects.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

The tag offsets are also handled by a backend hook.

This patch also adds some macros defining how the HWASAN shadow memory
is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable the tag to match the tag added to each pointer for
that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tag.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.


ChangeLog:

* config/bootstrap-hwasan.mk: Disable random frame tags for
stack-tagging during bootstrap.
* gcc/asan.c (struct hwasan_stack_var): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_allocas_p): New.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_frame_tag): New.
(hwasan_frame_base): New.
(hwasan_record_stack_var): New.
(hwasan_get_frame_extent): New.
(hwasan_increment_frame_tag): New.
(hwasan_record_frame_init): New.
(hwasan_emit_prologue): New.
(hwasan_emit_untag_frame): New.
(hwasan_finish_file): New.
(hwasan_truncate_to_tag_size): New.
* gcc/asan.h (hwasan_record_frame_init): New declaration.
(hwasan_record_stack_var): New declaration.
(hwasan_emit_prologue): New declaration.

Re: [PATCH 1/X] libsanitizer: Tie the hwasan library into our build system

2020-10-28 Thread Matthew Malcomson via Gcc-patches
so I don't think putting much effort into C++ exceptions makes sense right
now.

- LLVM does nothing special with strlen, it just doesn't instrument the call.

- About maybe combining the hwasan pass and the asan pass.
Yes we could just branch the "asan" pass based on hwasan vs asan flags.
I would prefer to have the "hwasan" pass separate (I liked having the
"correct" names for the -fdump-tree* arguments and dump files).
That said -- if you feel strongly about this I'll happily change it.
________
From: Richard Sandiford 
Sent: 13 October 2020 16:57
To: Matthew Malcomson 
Cc: gcc-patches@gcc.gnu.org ; k...@google.com 
; Richard Earnshaw ; 
ja...@redhat.com ; jos...@codesourcery.com 
; dvyu...@google.com ; Kyrylo 
Tkachov ; do...@redhat.com ; Martin 
Liska 
Subject: Re: [PATCH 1/X] libsanitizer: Tie the hwasan library into our build 
system

Sorry for the slow review.

Matthew Malcomson  writes:
> This patch tries to tie libhwasan into the GCC build system in the same way
> that the other sanitizer runtime libraries are handled.
>
> libsanitizer/ChangeLog:
>
>* Makefile.am:  Build libhwasan.
>* Makefile.in:  Build libhwasan.
>* asan/Makefile.in:  Build libhwasan.
>* configure:  Build libhwasan.
>* configure.ac:  Build libhwasan.
>* hwasan/Makefile.am: New file.
>* hwasan/Makefile.in: New file.
>* hwasan/libtool-version: New file.
>* interception/Makefile.in: Build libhwasan.
>* libbacktrace/Makefile.in: Build libhwasan.
>* libsanitizer.spec.in: Build libhwasan.
>* lsan/Makefile.in: Build libhwasan.
>* sanitizer_common/Makefile.in: Build libhwasan.
>* tsan/Makefile.in: Build libhwasan.
>* ubsan/Makefile.in: Build libhwasan.

I think this should also update README.gcc and merge.sh.  Could you
try locally merging in a dummy change to the llvm sources with merge.sh,
to make sure it works correctly?

> new file mode 100644
> index 
> ..aaa39b4536a5c5f54910a951470814bbc8a20946
> --- /dev/null
> +++ b/libsanitizer/hwasan/Makefile.am
> @@ -0,0 +1,88 @@
> +AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
> +
> +# May be used by toolexeclibdir.
> +gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
> +
> +DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS 
> -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DCAN_SANITIZE_UB=0 
> -DHWASAN_WITH_INTERCEPTORS=1
> +AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic 
> -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti 
> -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros 
> -fno-ipa-icf

I realise this is just taken from the other Makefile.ams, but do
you know the reason behind -fomit-frame-pointer?  I think we should
avoid building aarch64 libraries with that flag unless there's a
specific reason.

Otherwise looks good to me, although I'm definitely not an expert
on this stuff.

Thanks,
Richard


[PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing

2020-08-17 Thread Matthew Malcomson
These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling targeting the
kernel.  This flag has defaults that allow compiling KASAN with tags as
it is currently implemented.
These defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.
Stack tagging in the kernel is a future aim, I don't know of any reason
it would not work, but this has not yet been tested.

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/invoke.texi: Document hwasan command line flags.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan--instrument-stack): New
(hwasan-random-frame-tag): New
(hwasan-instrument-allocas): New
(hwasan-instrument-reads): New
(hwasan-instrument-writes): New
(hwasan-instrument-mem-intrinsics): New
* target.def (HOOK_PREFIX): Add new hook.
(can_tag_addresses): Add new hook under memtag prefix.
* targhooks.c (default_memtag_can_tag_addresses): New.
* targhooks.h (default_memtag_can_tag_addresses): New decl.
* toplev.c (process_options): Ensure hwasan only on TBI
architectures.

gcc/c-family/ChangeLog:

* c-attribs.c (handle_no_sanitize_hwaddress_attribute): New
attribute.



### Attachment also inlined for ease of reply###


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 
372148315389db6671dfd943fd1a68670fcb1cbc..f8bf165aa48b5709c26f4e8245e5ab929b44fca6
 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
bool *);
 static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
  int, bool *);
+static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
+   int, bool *);
 static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
 int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
  handle_no_sanitize_address_attribute, NULL },
+  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
+ handle_no_sanitize_hwaddress_attribute, NULL },
   { "no_sanitize_thread", 0, 0, true, false, false, false,
  handle_no_sanitize_thread_attribute, NULL },
   { "no_sanitize_undefined",  0, 0, true, false, false, false,
@@ -946,6 +950,22 @@ handle_no_sanitize_address_attribute (tree *node, tree 
name, tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "no_sanitize_hwaddress" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_hwaddress_attribute (tree *node, tree name, tree, int,
+ bool *no_add_attrs)
+{
+  *no_add_attrs = true;
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+warning (OPT_Wattributes, "%qE attribute ignored", name);
+  else
+add_no_sanitize_value (*node, SANITIZE_HWADDRESS);
+
+  return NULL_TREE;
+}
+
 /* Handle a "no_sanitize_thread" attribute; arguments as in
struct 

[PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-08-17 Thread Matthew Malcomson
There are four main features to this change:

1) Check pointer tags match address tags.

In the new `hwasan` pass we put HWASAN_CHECK internal functions before
all memory accesses to check that tags in the pointer being used match
the tag stored in shadow memory for the memory region being used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(hwasan_instrument): New.
(hwasan_base): New.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
(class pass_hwasan): New.
(make_pass_hwasan): New.
(class pass_hwasan_O0): New.
(make_pass_hwasan_O0): New.
* asan.h (hwasan_base): New decl.
(hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(enum hwasan_mark_flags): New.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
* internal-fn.def (HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_ALLOCA_UNPOISON): New.
* passes.def: Add hwasan and hwasan_O0 passes.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.
(BUILT_IN_HWASAN_STORE8): New.
(BUILT_IN_HWASAN_STORE16): New.
(BUILT_IN_HWASAN_STOREN): 

[PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-08-17 Thread Matthew Malcomson
Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
object is aligned to ensure the start of any other data stored on the
stack is in a different granule.

This patch ensures the above by adding alignment requirements in
`align_local_variable` and forcing all stack variable allocation to be
deferred so that `expand_stack_vars` can ensure the stack pointer is
aligned before allocating any variable for the current frame.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

The tag offsets are also handled by a backend hook.

This patch also adds some macros defining how the HWASAN shadow memory
is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable the tag to match the tag added to each pointer for
that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tag.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog:

* bootstrap-hwasan.mk: Disable random frame tags for
stack-tagging during bootstrap.

gcc/ChangeLog:

* asan.c (hwasan_record_base): New function.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New function.
(hwasan_with_tag): New function.
(hwasan_tag_init): New function.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_tag): New.
(hwasan_extract_tag): New.
(hwasan_emit_prologue): New.
(hwasan_create_untagged_base): New.
(hwasan_finish_file): New.
(hwasan_ctor_statements): New variable.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_allocas_p): New.
* asan.h (hwasan_record_base): New declaration.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New declaration.
(hwasan_with_tag): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_sanitize_allocas_p): New declaration.
(hwasan_tag_init): New declaration.
(hwasan_sanitize_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE): New macro.
(HWASAN_TAG_SHIFT_SIZE): New macro.
(HWASAN_SHIFT): New macro.
(HWASAN_SHIFT_RTX): New macro.
(HWASAN_STACK_BACKGROUND): New macro.
(hwasan_finish_file): New declaration.
(hwasan_current_tag): New declaration.
(hwasan_create_untagged_base): New declaration.
(hwasan_extract_tag): New declaration.
(hwasan_emit_prologue): New declaration.
* cfgexpand.c (struct stack_vars_data): Add information to
record hwasan variable stack offsets.
(expand_stack_vars): Ensure variables are offset from a tagged
base. Record offsets for hwasan. Ensure alignment.
(expand_used_vars): Call function to emit prologue, and get
untagging instructions for function exit.
(align_local_variable): Ensure alignment.
(defer_stack_allocation): Ensure all variables are deferred so
they can be handled by 

[PATCH 7/X] libsanitizer: Add tests

2020-08-17 Thread Matthew Malcomson
Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-2.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-3.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-4.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-5.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-6.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-7.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/param-instrument-mem-intrinsics.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New file.
* g++.dg/hwasan/rvo-handled.C: New test.
* gcc.dg/hwasan/hwasan.exp: New file.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New file.


hwasan-diff6.patch.gz
Description: application/gzip


[PATCH 1/X] libsanitizer: Tie the hwasan library into our build system

2020-08-17 Thread Matthew Malcomson
This patch tries to tie libhwasan into the GCC build system in the same way
that the other sanitizer runtime libraries are handled.

libsanitizer/ChangeLog:

* Makefile.am:  Build libhwasan.
* Makefile.in:  Build libhwasan.
* asan/Makefile.in:  Build libhwasan.
* configure:  Build libhwasan.
* configure.ac:  Build libhwasan.
* hwasan/Makefile.am: New file.
* hwasan/Makefile.in: New file.
* hwasan/libtool-version: New file.
* interception/Makefile.in: Build libhwasan.
* libbacktrace/Makefile.in: Build libhwasan.
* libsanitizer.spec.in: Build libhwasan.
* lsan/Makefile.in: Build libhwasan.
* sanitizer_common/Makefile.in: Build libhwasan.
* tsan/Makefile.in: Build libhwasan.
* ubsan/Makefile.in: Build libhwasan.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,11 +14,12 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan
+SUBDIRS += lsan asan ubsan hwasan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
-  include/sanitizer/tsan_interface.h
+  include/sanitizer/tsan_interface.h \
+  include/sanitizer/hwasan_interface.h
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
02c7f70ac6578a3e93a490ce8bd2c54fc0693c50..2c57d49cbffdb486645aeb5f2c0f85d6e0fad124
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -92,7 +92,8 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@am__append_1 = 
include/sanitizer/common_interface_defs.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/lsan_interface.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/asan_interface.h \
-@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h \
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/hwasan_interface.h
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
@@ -207,7 +208,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan tsan
+   ubsan hwasan tsan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -329,6 +330,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
@@ -362,7 +364,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 
29622bf466a37f819c9fade30e31195adda51190..25c7fd7b7597d6e243005a1bb7de5b6243d2cfcf
 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -383,6 +383,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
04eca04fbe5e59bae1ba00597de0cf1b7cf1b5fa..9ed9669a85d3cfc2f2f623e796e61a5f8f7e4ded
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -657,6 +657,7 @@ USING_MAC_INTERPOSE_TRUE
 link_liblsan
 link_libubsan
 link_libtsan
+link_libhwasan
 link_libasan
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
@@ -12361,7 +12362,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12364 "configure"
+#line 12365 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12467,7 +12468,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12470 "configure"
+#line 12471 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15943,6 +15944,10 @@ fi
 link_libasan=$link_sanitizer_common
 
 
+# Set up the set of additional 

[PATCH 2/X] libsanitizer: Only build libhwasan when targeting AArch64

2020-08-17 Thread Matthew Malcomson
Though the library has limited support for x86, we don't have any
support for generating code targeting x86 so there is no point building
for that target.

libsanitizer/ChangeLog:

* Makefile.am: Condition building hwasan directory.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Set HWASAN_SUPPORTED based on target
architecture.
* configure.tgt: Likewise.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
2a7e8e1debe838719db0f0fad218b2543cc3111b..065a65e78d49f7689a01ecb64db1f07ca83aa987
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,7 +14,7 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan hwasan
+SUBDIRS += lsan asan ubsan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
@@ -23,6 +23,9 @@ nodist_saninclude_HEADERS += \
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
+if HWASAN_SUPPORTED
+SUBDIRS += hwasan
+endif
 endif
 
 ## May be used by toolexeclibdir.
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
2c57d49cbffdb486645aeb5f2c0f85d6e0fad124..3873ea4d7050f04a3f7bbd0dd3f2a71e9b65d287
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -97,6 +97,7 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
+@HWASAN_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_5 = hwasan
 subdir = .
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
@@ -208,7 +209,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan hwasan tsan
+   ubsan tsan hwasan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -364,7 +365,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ $(am__append_4) $(am__append_5)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
9ed9669a85d3cfc2f2f623e796e61a5f8f7e4ded..cc5c229f4aebcdd454e9e2e415a8e16046dc1b1a
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -659,6 +659,8 @@ link_libubsan
 link_libtsan
 link_libhwasan
 link_libasan
+HWASAN_SUPPORTED_FALSE
+HWASAN_SUPPORTED_TRUE
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
 TSAN_SUPPORTED_FALSE
@@ -12362,7 +12364,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12365 "configure"
+#line 12367 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12468,7 +12470,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12471 "configure"
+#line 12473 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15819,6 +15821,7 @@ fi
 # Get target configury.
 unset TSAN_SUPPORTED
 unset LSAN_SUPPORTED
+unset HWASAN_SUPPORTED
 . ${srcdir}/configure.tgt
  if test "x$TSAN_SUPPORTED" = "xyes"; then
   TSAN_SUPPORTED_TRUE=
@@ -15836,6 +15839,14 @@ else
   LSAN_SUPPORTED_FALSE=
 fi
 
+ if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  HWASAN_SUPPORTED_TRUE=
+  HWASAN_SUPPORTED_FALSE='#'
+else
+  HWASAN_SUPPORTED_TRUE='#'
+  HWASAN_SUPPORTED_FALSE=
+fi
+
 
 # Check for functions needed.
 for ac_func in clock_getres clock_gettime clock_settime lstat readlink
@@ -16818,7 +16829,7 @@ ac_config_files="$ac_config_files Makefile 
libsanitizer.spec libbacktrace/backtr
 ac_config_headers="$ac_config_headers config.h"
 
 
-ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
hwasan/Makefile ubsan/Makefile"
+ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
ubsan/Makefile"
 
 
 if test "x$TSAN_SUPPORTED" = "xyes"; then
@@ -16826,6 +16837,11 @@ if test "x$TSAN_SUPPORTED" = "xyes"; then
 
 fi
 
+if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  ac_config_files="$ac_config_files hwasan/Makefile"
+
+fi
+
 
 
 
@@ -17090,6 +17106,10 @@ if test -z "${LSAN_SUPPORTED_TRUE}" && test -z 
"${LSAN_SUPPORTED_FALSE}"; then
   as_fn_error $? "conditional \"LSAN_SUPPORTED\" was never defined.
 Usually 

[PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN

2020-08-17 Thread Matthew Malcomson
This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

* bootstrap-hwasan.mk: New file.

gcc/ChangeLog:

* doc/install.texi: Document new option.

libiberty/ChangeLog:

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
a0c5aca9e8d5cae2782c8fe4625a501853dc226a..203319e3f899e8d24429950c3a5d22927fb5150f
 100755
--- a/configure
+++ b/configure
@@ -8297,7 +8297,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
1a53ed418e4d97606356b14a17b50186c79adcd3..9d5c187c31bfc01003e75058896b686807e47643
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2809,7 +2809,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
d581a34653f61a440b3c3b832836fe109e2fbd08..25d041fcbb1f7c16f7ac47b7b5d4ea8308c6f69c
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2767,6 +2767,11 @@ the build tree.
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
 
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+targets running a Linux kernel that supports the required ABI (5.4 or later).
+
 @end table
 
 @section Building a cross compiler
diff --git a/libiberty/configure b/libiberty/configure
index 
1f8e23f0d235a6a5d5158bf6705023db95ac7023..59e0b73d5838bbd42a5548759084471e97ec254f
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5264,6 +5264,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
4e2599c14a89bafcb8c7e523b9ce5b3d60b8c0f6..ad952963971a31968b5d109661b9cab0aa4b95fc
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
ba5882df7a7272f65219191c82ecd78ab4d3725e..50d6e09dac881d28d4ff70def47b09ed8c0ea66c
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@ $(CET_HOST_FLAGS)
 AM_LDFLAGS = @ac_lto_plugin_ldflags@
 AM_LIBTOOLFLAGS = --tag=disable-static
-override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS))
-override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS))
+override CFLAGS := $(filter-out -fsanitize=address 
-fsanitize=hwaddress,$(CFLAGS))
+override LDFLAGS := $(filter-out -fsanitize=address 
-fsanitize=hwaddress,$(LDFLAGS))
 
 libexecsub_LTLIBRARIES = liblto_plugin.la
 gcc_build_dir = @gcc_build_dir@
diff --git 

Re: SLS Mitigation patches backported for GCC9

2020-07-24 Thread Matthew Malcomson

On 24/07/2020 12:01, Kyrylo Tkachov wrote:

Hi Matthew,


-Original Message-
From: Matthew Malcomson 
Sent: 21 July 2020 16:16
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw ; Kyrylo Tkachov
; Ross Burton 
Subject: SLS Mitigation patches backported for GCC9

Hello,

Eventually we will want to backport the SLS patches to older branches.

When the GCC10 release is unfrozen we will work on getting the same
patches
already posted backported to that branch.  The patches already posted on
the
mailing list apply cleanly to the current releases/gcc-10 branch.

I've heard interest in having the GCC 9 patches, so I'm posting the modified
versions upstream sooner than otherwise.


I'd say let's go ahead with the GCC 10 patches (assuming testing works out well 
on there).
For the GCC 9 patches it would be useful if you included a bit of text of how 
they differ from the GCC 10/11 patches.
This would speed up the technical review.
Thanks,
Kyrill



Cheers,
Matthew

Entire patch series attached to cover letter.


Below were the only two "interesting" hunks that failed to apply after 
`patch -p1`.


The differences causing these were:
- in GCC-9 the `retab` instruction wasn't in the "do_return" pattern.
- `simple_return` had "aarch64_use_simple_return_insn_p ()" as a
   condition.




--- gcc/config/aarch64/aarch64.md
+++ gcc/config/aarch64/aarch64.md
@@ -863,18 +882,23 @@
   [(return)]
   ""
   {
+const char *ret = NULL;
 if (aarch64_return_address_signing_enabled ()
&& TARGET_ARMV8_3
&& !crtl->calls_eh_return)
   {
if (aarch64_ra_sign_key == AARCH64_KEY_B)
- return "retab";
+ ret = "retab";
else
- return "retaa";
+ ret = "retaa";
   }
-return "ret";
+else
+  ret = "ret";
+output_asm_insn (ret, operands);
+return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
   }
-  [(set_attr "type" "branch")]
+  [(set_attr "type" "branch")
+   (set_attr "sls_length" "retbr")]
 )

 (define_expand "return"
@@ -886,8 +910,12 @@
 (define_insn "simple_return"
   [(simple_return)]
   ""
-  "ret"
-  [(set_attr "type" "branch")]
+  {
+output_asm_insn ("ret", operands);
+return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
+  }
+  [(set_attr "type" "branch")
+   (set_attr "sls_length" "retbr")]
 )

 (define_insn "*cb1"


aarch64: (GCC-9 Backport) New Straight Line Speculation (SLS) mitigation flags

2020-07-21 Thread Matthew Malcomson
Here we introduce the flags that will be used for straight line speculation.

The new flag introduced is `-mharden-sls=`.
This flag can take arguments of `none`, `all`, or a comma seperated list
of one or more of `retbr` or `blr`.
`none` indicates no special mitigation of the straight line speculation
vulnerability.
`all` requests all mitigations currently implemented.
`retbr` requests that the RET and BR instructions have a speculation
barrier inserted after them.
`blr` requests that BLR instructions are replaced by a BL to a function
stub using a BR with a speculation barrier after it.

Setting this on a per-function basis using attributes or the like is not
enabled, but may be in the future.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_harden_sls_retbr_p):
New.
(aarch64_harden_sls_blr_p): New.
* config/aarch64/aarch64.c (enum aarch64_sls_hardening_type):
New.
(aarch64_harden_sls_retbr_p): New.
(aarch64_harden_sls_blr_p): New.
(aarch64_validate_sls_mitigation): New.
(aarch64_override_options): Parse options for SLS mitigation.
* config/aarch64/aarch64.opt (-mharden-sls): New option.
* doc/invoke.texi: Document new option.

(cherry picked from commit a9ba2a9b77bec7eacaf066801f22d1c366a2bc86)


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
af2a17f0bf3b50a306b14d8c0aa431269f54ef2e..db5a6a3a1813abbbeeaaa7009466ac58595e12bc
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -658,4 +658,7 @@ extern const atomic_ool_names aarch64_ool_ldset_names;
 extern const atomic_ool_names aarch64_ool_ldclr_names;
 extern const atomic_ool_names aarch64_ool_ldeor_names;
 
+extern bool aarch64_harden_sls_retbr_p (void);
+extern bool aarch64_harden_sls_blr_p (void);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
1b6e67ccd53da05c88b13cac8b524ba2b8cfe43f..0903370b6128e19016f97cb6b0fd44e6b51e569e
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11791,6 +11791,79 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
   return false;
 }
 
+/* Straight line speculation indicators.  */
+enum aarch64_sls_hardening_type
+{
+  SLS_NONE = 0,
+  SLS_RETBR = 1,
+  SLS_BLR = 2,
+  SLS_ALL = 3,
+};
+static enum aarch64_sls_hardening_type aarch64_sls_hardening;
+
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_retbr_p (void)
+{
+  return aarch64_sls_hardening & SLS_RETBR;
+}
+
+/* Return whether we should mitigatate Straight Line Speculation for the BLR
+   instruction.  */
+bool
+aarch64_harden_sls_blr_p (void)
+{
+  return aarch64_sls_hardening & SLS_BLR;
+}
+
+/* As of yet we only allow setting these options globally, in the future we may
+   allow setting them per function.  */
+static void
+aarch64_validate_sls_mitigation (const char *const_str)
+{
+  char *token_save = NULL;
+  char *str = NULL;
+
+  if (strcmp (const_str, "none") == 0)
+{
+  aarch64_sls_hardening = SLS_NONE;
+  return;
+}
+  if (strcmp (const_str, "all") == 0)
+{
+  aarch64_sls_hardening = SLS_ALL;
+  return;
+}
+
+  char *str_root = xstrdup (const_str);
+  str = strtok_r (str_root, ",", _save);
+  if (!str)
+error ("invalid argument given to %<-mharden-sls=%>");
+
+  int temp = SLS_NONE;
+  while (str)
+{
+  if (strcmp (str, "blr") == 0)
+   temp |= SLS_BLR;
+  else if (strcmp (str, "retbr") == 0)
+   temp |= SLS_RETBR;
+  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
+   {
+ error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ break;
+   }
+  else
+   {
+ error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
+ break;
+   }
+  str = strtok_r (NULL, ",", _save);
+}
+  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
+  free (str_root);
+}
+
 /* Parses CONST_STR for branch protection features specified in
aarch64_branch_protect_types, and set any global variables required.  
Returns
the parsing result and assigns LAST_STR to the last processed token from
@@ -12029,6 +12102,9 @@ aarch64_override_options (void)
   selected_arch = NULL;
   selected_tune = NULL;
 
+  if (aarch64_harden_sls_string)
+aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
+
   if (aarch64_branch_protection_string)
 aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 
f474a28eb9234cf87b22492c4396d5a31857cd39..4beb2d3d243c4ca36e7604c02ab5d9e80955f55a
 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ 

aarch64: (GCC-9 Backport) Introduce SLS mitigation for RET and BR instructions

2020-07-21 Thread Matthew Malcomson
Instructions following RET or BR are not necessarily executed.  In order
to avoid speculation past RET and BR we can simply append a speculation
barrier.

Since these speculation barriers will not be architecturally executed,
they are not expected to add a high performance penalty.

The speculation barrier is to be SB when targeting architectures which
have this enabled, and DSB SY + ISB otherwise.

We add tests for each of the cases where such an instruction was seen.

This is implemented by modifying each machine description pattern that
emits either a RET or a BR instruction.  We choose not to use something
like `TARGET_ASM_FUNCTION_EPILOGUE` since it does not affect the
`indirect_jump`, `jump`, `sibcall_insn` and `sibcall_value_insn`
patterns and we find it preferable to implement the functionality in the
same way for every pattern.

There is one particular case which is slightly tricky.  The
implementation of TARGET_ASM_TRAMPOLINE_TEMPLATE uses a BR which needs
to be mitigated against.  The trampoline template is used *once* per
compilation unit, and the TRAMPOLINE_SIZE is exposed to the user via the
builtin macro __LIBGCC_TRAMPOLINE_SIZE__.
In the future we may implement function specific attributes to turn on
and off hardening on a per-function basis.
The fixed nature of the trampoline described above implies it will be
safer to ensure this speculation barrier is always used.

Testing:
  Bootstrap and regtest done on aarch64-none-linux
  Used a temporary hack(1) to use these options on every test in the
  testsuite and a script to check that the output never emitted an
  unmitigated RET or BR.

1) Temporary hack was a change to the testsuite to always use
`-save-temps` and run a script on the assembly output of those
compilations which produced one to ensure every RET or BR is immediately
followed by a speculation barrier.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_sls_barrier): New.
* config/aarch64/aarch64.c (aarch64_output_casesi): Emit
speculation barrier after BR instruction if needs be.
(aarch64_trampoline_init): Handle ptr_mode value & adjust size
of code copied.
(aarch64_sls_barrier): New.
(aarch64_asm_trampoline_template): Add needed barriers.
* config/aarch64/aarch64.h (AARCH64_ISA_SB): New.
(TARGET_SB): New.
(TRAMPOLINE_SIZE): Account for barrier.
* config/aarch64/aarch64.md (indirect_jump, *casesi_dispatch,
simple_return, *do_return, *sibcall_insn, *sibcall_value_insn):
Emit barrier if needs be, also account for possible barrier using
"sls_length" attribute.
(sls_length): New attribute.
(length): Determine default using any non-default sls_length
value.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sls-mitigation/sls-miti-retbr.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c:
New test.
* gcc.target/aarch64/sls-mitigation/sls-mitigation.exp: New file.
* lib/target-supports.exp (check_effective_target_aarch64_asm_sb_ok):
New proc.

(cherry picked from be178ecd5ac1fe1510d960ff95c66d0ff831afe1)


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
db5a6a3a1813abbbeeaaa7009466ac58595e12bc..46196fe648e06f6b7543d8656cd6be6b39f8f774
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -658,6 +658,7 @@ extern const atomic_ool_names aarch64_ool_ldset_names;
 extern const atomic_ool_names aarch64_ool_ldclr_names;
 extern const atomic_ool_names aarch64_ool_ldeor_names;
 
+const char *aarch64_sls_barrier (int);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
0e9b79e23f05f34cfee5fdccf5792af077a95798..8dcc9b1cec347104ca3bbcc34a45b30fcf35a886
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -235,6 +235,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_F16FML(aarch64_isa_flags & AARCH64_FL_F16FML)
 #define AARCH64_ISA_RCPC8_4   (aarch64_isa_flags & AARCH64_FL_RCPC8_4)
 #define AARCH64_ISA_V8_5  (aarch64_isa_flags & AARCH64_FL_V8_5)
+#define AARCH64_ISA_SB(aarch64_isa_flags & AARCH64_FL_SB)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -285,6 +286,9 @@ extern unsigned aarch64_architecture_version;
 #define TARGET_FIX_ERR_A53_835769_DEFAULT 1
 #endif
 
+/* SB instruction is enabled through +sb.  */
+#define TARGET_SB (AARCH64_ISA_SB)
+
 /* Apply the workaround for Cortex-A53 erratum 835769.  */
 #define TARGET_FIX_ERR_A53_835769  \
   ((aarch64_fix_a53_err835769 == 2)\
@@ -931,8 +935,10 @@ typedef struct
 
 #define RETURN_ADDR_RTX 

aarch64: (GCC-9 Backport) Mitigate SLS for BLR instruction

2020-07-21 Thread Matthew Malcomson
This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which uses
a BR to jump to the original value.  These function stubs are then
appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for all functions.
This set of stubs can have human readable names, and we are using
`__call_indirect_x` for register x.

When BTI branch protection is enabled the BLR instruction can jump to a
`BTI c` instruction using any register, while the BR instruction can
only jump to a `BTI c` instruction using the x16 or x17 registers.
Hence, in order to ensure this transformation is safe we mov the value
of the original register into x16 and use x16 for the BR.

As an example when optimising for size:
a
BLR x0
instruction would get transformed to something like
BL __call_indirect_x0
where __call_indirect_x0 labels a thunk that contains
__call_indirect_x0:
MOV X16, X0
BR X16


The first version of this patch used local symbols specific to a
compilation unit to try and avoid relocations.
This was mistaken since functions coming from the same compilation unit
can still be in different sections, and the assembler will insert
relocations at jumps between sections.

On any relocation the linker is permitted to emit a veneer to handle
jumps between symbols that are very far apart.  The registers x16 and
x17 may be clobbered by these veneers.
Hence the function stubs cannot rely on the values of x16 and x17 being
the same as just before the function stub is called.

Similar can be said for the hot/cold partitioning of single functions,
so function-local stubs have the same restriction.

This updated version of the patch never emits function stubs for x16 and
x17, and instead forces other registers to be used.

Given the above, there is now no benefit to local symbols (since they
are not enough to avoid dealing with linker intricacies).  This patch
now uses global symbols with hidden visibility each stored in their own
COMDAT section.  This means stubs can be shared between compilation
units while still avoiding the PLT indirection.

This patch also removes the `__call_indirect_x30` stub (and
function-local equivalent) which would simply jump back to the original
location.

The function-local stubs are emitted to the assembly output file in one
chunk, which means we need not add the speculation barrier directly
after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

The global stubs are emitted in COMDAT/.linkonce sections by
themselves so that the linker can remove duplicates from multiple object
files.  This means they are not emitted in one chunk, and each one must
include the speculation barrier.

Another difference is that since the global stubs are shared across
compilation units we do not know that all functions will be targeting an
architecture supporting the SB instruction.
Rather than provide multiple stubs for each architecture, we provide a
stub that will work for all architectures -- using the DSB+ISB barrier.

This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revalation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revalation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm):
New declaration.
* 

SLS Mitigation patches backported for GCC9

2020-07-21 Thread Matthew Malcomson
Hello,

Eventually we will want to backport the SLS patches to older branches.

When the GCC10 release is unfrozen we will work on getting the same patches
already posted backported to that branch.  The patches already posted on the
mailing list apply cleanly to the current releases/gcc-10 branch.

I've heard interest in having the GCC 9 patches, so I'm posting the modified
versions upstream sooner than otherwise.

Cheers,
Matthew

Entire patch series attached to cover letter.

all-patches.tar.gz
Description: application/gzip


Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-07-03 Thread Matthew Malcomson
With suggestions applied.
Testing with `-mabi=ilp32` found a bug around the trampoline
initialisation where the new larger size of the trampoline caused a
different execution path of `emit_block_move` which ICE'd on the
pre-existing `ptr_mode` address.


Commit Message
--

Instructions following RET or BR are not necessarily executed.  In order
to avoid speculation past RET and BR we can simply append a speculation
barrier.

Since these speculation barriers will not be architecturally executed,
they are not expected to add a high performance penalty.

The speculation barrier is to be SB when targeting architectures which
have this enabled, and DSB SY + ISB otherwise.

We add tests for each of the cases where such an instruction was seen.

This is implemented by modifying each machine description pattern that
emits either a RET or a BR instruction.  We choose not to use something
like `TARGET_ASM_FUNCTION_EPILOGUE` since it does not affect the
`indirect_jump`, `jump`, `sibcall_insn` and `sibcall_value_insn`
patterns and we find it preferable to implement the functionality in the
same way for every pattern.

There is one particular case which is slightly tricky.  The
implementation of TARGET_ASM_TRAMPOLINE_TEMPLATE uses a BR which needs
to be mitigated against.  The trampoline template is used *once* per
compilation unit, and the TRAMPOLINE_SIZE is exposed to the user via the
builtin macro __LIBGCC_TRAMPOLINE_SIZE__.
In the future we may implement function specific attributes to turn on
and off hardening on a per-function basis.
The fixed nature of the trampoline described above implies it will be
safer to ensure this speculation barrier is always used.

Testing:
  Bootstrap and regtest done on aarch64-none-linux
  Used a temporary hack(1) to use these options on every test in the
  testsuite and a script to check that the output never emitted an
  unmitigated RET or BR.


1) Temporary hack was a change to the testsuite to always use
`-save-temps` and run a script on the assembly output of those
compilations which produced one to ensure every RET or BR is immediately
followed by a speculation barrier.


gcc/ChangeLog:

2020-07-03  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_sls_barrier): New.
* config/aarch64/aarch64.c (aarch64_output_casesi): Emit
speculation barrier after BR instruction if needs be.
(aarch64_trampoline_init): Handle ptr_mode value & adjust size
of code copied.
(aarch64_sls_barrier): New.
(aarch64_asm_trampoline_template): Add needed barriers.
* config/aarch64/aarch64.h (AARCH64_ISA_SB): New.
(TARGET_SB): New.
(TRAMPOLINE_SIZE): Account for barrier.
* config/aarch64/aarch64.md (indirect_jump, *casesi_dispatch,
*do_return, simple_return, *sibcall_insn, *sibcall_value_insn):
Emit barrier if needs be, also account for possible barrier using
"sls_length" attribute.
(sls_length): New attribute.
(length): Determine default using any non-default sls_length
value.
* config/aarch64/aarch64.opt (-mharden-sls-retbr): Introduce new
option.

gcc/testsuite/ChangeLog:

2020-07-03  Matthew Malcomson  

* gcc.target/aarch64/sls-mitigation/sls-miti-retbr.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c:
New test.
* gcc.target/aarch64/sls-mitigation/sls-mitigation.exp: New file.
* lib/target-supports.exp (check_effective_target_aarch64_asm_sb_ok):
New proc.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
8ca67d7e69edaf73c84f079e7e1c483009ad10c0..b035e4ec78e2ef1c9a931148dffacf6a50345b84
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,6 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+const char *aarch64_sls_barrier (int);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
2be52fd4d73def0007795159298e3d3e8fc4399d..d60d295830d3e422bb4267de275597d2087b99e6
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -281,6 +281,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_F32MM (aarch64_isa_flags & AARCH64_FL_F32MM)
 #define AARCH64_ISA_F64MM (aarch64_isa_flags & AARCH64_FL_F64MM)
 #define AARCH64_ISA_BF16  (aarch64_isa_flags & AARCH64_FL_BF16)
+#define AARCH64_ISA_SB(aarch64_isa_flags & AARCH64_FL_SB)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -378,6 +379,9 @@ extern unsigned 

Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-07-03 Thread Matthew Malcomson
With suggestions applied.


Here we introduce the flags that will be used for straight line speculation.

The new flag introduced is `-mharden-sls=`.
This flag can take arguments of `none`, `all`, or a comma seperated list of one
or more of `retbr` or `blr`.
`none` indicates no special mitigation of the straight line speculation
vulnerability.
`all` requests all mitigations currently implemented.
`retbr` requests that the RET and BR instructions have a speculation barrier
inserted after them.
`blr` requests that BLR instructions are replaced by a BL to a function stub
using a BR with a speculation barrier after it.

Setting this on a per-function basis using attributes or the like is not
enabled, but may be in the future.

gcc/ChangeLog:

2020-07-03  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_harden_sls_retbr_p):
New.
(aarch64_harden_sls_blr_p): New.
* config/aarch64/aarch64.c (enum aarch64_sls_hardening_type):
New.
(aarch64_harden_sls_retbr_p): New.
(aarch64_harden_sls_blr_p): New.
(aarch64_validate_sls_mitigation): New.
(aarch64_override_options): Parse options for SLS mitigation.
* config/aarch64/aarch64.opt (-mharden-sls): New option.
* doc/invoke.texi: Document new option.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
9e43adb7db0373df6cc5ef1d2b22f217aca2aad2..8ca67d7e69edaf73c84f079e7e1c483009ad10c0
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,4 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+extern bool aarch64_harden_sls_retbr_p (void);
+extern bool aarch64_harden_sls_blr_p (void);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
f3551a73d87c4e686540f39224985592c3c66fd1..b1a7c10c4eaadd78eb45926c23efc51a8272b5fd
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14502,6 +14502,80 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
   return false;
 }
 
+/* Straight line speculation indicators.  */
+enum aarch64_sls_hardening_type
+{
+  SLS_NONE = 0,
+  SLS_RETBR = 1,
+  SLS_BLR = 2,
+  SLS_ALL = 3,
+};
+static enum aarch64_sls_hardening_type aarch64_sls_hardening;
+
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_retbr_p (void)
+{
+  return aarch64_sls_hardening & SLS_RETBR;
+}
+
+/* Return whether we should mitigatate Straight Line Speculation for the BLR
+   instruction.  */
+bool
+aarch64_harden_sls_blr_p (void)
+{
+  return aarch64_sls_hardening & SLS_BLR;
+}
+
+/* As of yet we only allow setting these options globally, in the future we may
+   allow setting them per function.  */
+static void
+aarch64_validate_sls_mitigation (const char *const_str)
+{
+  char *token_save = NULL;
+  char *str = NULL;
+
+  aarch64_sls_hardening = SLS_NONE;
+  if (strcmp (const_str, "none") == 0)
+{
+  aarch64_sls_hardening = SLS_NONE;
+  return;
+}
+  if (strcmp (const_str, "all") == 0)
+{
+  aarch64_sls_hardening = SLS_ALL;
+  return;
+}
+
+  char *str_root = xstrdup (const_str);
+  str = strtok_r (str_root, ",", _save);
+  if (!str)
+error ("invalid argument given to %<-mharden-sls=%>");
+
+  int temp = SLS_NONE;
+  while (str)
+{
+  if (strcmp (str, "blr") == 0)
+   temp |= SLS_BLR;
+  else if (strcmp (str, "retbr") == 0)
+   temp |= SLS_RETBR;
+  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
+   {
+ error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ break;
+   }
+  else
+   {
+ error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
+ break;
+   }
+  str = strtok_r (NULL, ",", _save);
+}
+  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
+  free (str_root);
+}
+
 /* Parses CONST_STR for branch protection features specified in
aarch64_branch_protect_types, and set any global variables required.  
Returns
the parsing result and assigns LAST_STR to the last processed token from
@@ -14746,6 +14820,9 @@ aarch64_override_options (void)
   selected_arch = NULL;
   selected_tune = NULL;
 
+  if (aarch64_harden_sls_string)
+aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
+
   if (aarch64_branch_protection_string)
 aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 
d99d14c137d8774d

Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-06-23 Thread Matthew Malcomson

On 23/06/2020 16:48, Richard Sandiford wrote:

Matthew Malcomson  writes:

@@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
return false;
  mfix-cortex-a53-835769
  Target Report Var(aarch64_fix_a53_err835769) Init(2) Save
  Workaround for ARM Cortex-A53 Erratum number 835769.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 
35e8242af5fa4c52744fd2c3e2cfee0a617e22bb..8a3fab2964c9bb06c820766d284768751d63ac9a
 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -696,6 +696,7 @@ Objective-C and Objective-C++ Dialects}.
  -msign-return-address=@var{scope} @gol
  -mbranch-protection=@var{none}|@var{standard}|@var{pac-ret}[+@var{leaf}
  +@var{b-key}]|@var{bti} @gol
+-mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr} @gol
  -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
  -moverride=@var{string}  -mverbose-cost-dump @gol
  -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} 
@gol
@@ -17045,6 +17046,15 @@ functions.  The optional argument @samp{b-key} can be 
used to sign the functions
  with the B-key instead of the A-key.
  @samp{bti} turns on branch target identification mechanism.
  
+@item -mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr}

+@opindex mharden-sls
+Enable compiler hardening against straight line speculation (SLS).
+There are two options for hardening against straight line speculation.
+@samp{retbr} allows inserting speculation barriers after every
+@samp{br} and @samp{ret} instruction.  While @samp{blr} enables replacing
+@samp{blr} instructions with a @samp{bl} to a function stub.
+@samp{all} enables all SLS hardening, while @samp{none} does not enable any.


OK, so this is even more picky, sorry, but the syntax and description
imply to me that you can choose only one of the four options.  I think
it would be more accurate to say something like:

@item -mharden-sls=@var{opts}
@opindex mharden-sls
Enable compiler hardening against straight line speculation (SLS).
@var{opts} is a comma-separated list of the following options:
@table @samp
@item retbr
…
@item blr
…
@end table
In addition, @samp{-mharden-sls=all} enables all SLS hardening
while @samp{-mharden-sls=none} disables all SLS hardening.

(assuming the above behaviour change for “none”)

Thanks,
Richard



Another "just to check": the same change should be made in the short 
form right? (i.e. the hunk above is now `-mharden-sls=@var{opts}`)


Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-23 Thread Matthew Malcomson

On 23/06/2020 17:56, Richard Sandiford wrote:

Matthew Malcomson  writes:

On 23/06/2020 17:17, Richard Sandiford wrote:

Matthew Malcomson  writes:

--- a/gcc/config/aarch64/aarch64-protos.h
+/* Ensure there are no BR or RET instructions which are not directly followed
+   by a speculation barrier.  */
+/* { dg-final { scan-assembler-not 
"\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */


Isn't the “sb” alternative invalid given the -march option?

Probably slightly easier to read if the regexp is quoted using {…}
rather than "…".  Same for the other tests.



Just to check before I respin:  Using {} instead of "" means I need to
replace \t with a literal tab -- do you still prefer it?


Are you sure?  We've been using tests like:

/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, 
z[0-9]+\.s\n} 1 } } */

for SVE without problems.  Using {…} means that backslash quoting
is applied by the regexp parser rather than the Tcl string parser,
but both should work for things like \t.

Richard



Ah -- my mistake -- I was just checking with `string compare` while 
making the change and didn't think too hard when I saw a -1.


Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-23 Thread Matthew Malcomson

On 23/06/2020 17:17, Richard Sandiford wrote:

Matthew Malcomson  writes:

--- a/gcc/config/aarch64/aarch64-protos.h
+/* Ensure there are no BR or RET instructions which are not directly followed
+   by a speculation barrier.  */
+/* { dg-final { scan-assembler-not 
"\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */


Isn't the “sb” alternative invalid given the -march option?

Probably slightly easier to read if the regexp is quoted using {…}
rather than "…".  Same for the other tests.



Just to check before I respin:  Using {} instead of "" means I need to 
replace \t with a literal tab -- do you still prefer it?


[Patch v2 3/3] aarch64: Mitigate SLS for BLR instruction

2020-06-23 Thread Matthew Malcomson
This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which uses
a BR to jump to the original value.  These function stubs are then
appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for all functions.
This set of stubs can have human readable names, and we are using
`__call_indirect_x` for register x.

When BTI branch protection is enabled the BLR instruction can jump to a
`BTI c` instruction using any register, while the BR instruction can
only jump to a `BTI c` instruction using the x16 or x17 registers.
Hence, in order to ensure this transformation is safe we mov the value
of the original register into x16 and use x16 for the BR.

As an example when optimising for size:
a
BLR x0
instruction would get transformed to something like
BL __call_indirect_x0
where __call_indirect_x0 labels a thunk that contains
__call_indirect_x0:
MOV X16, X0
BR X16



The first version of this patch used local symbols specific to a
compilation unit to try and avoid relocations.
This was mistaken since functions coming from the same compilation unit
can still be in different sections, and the assembler will insert
relocations at jumps between sections.

On any relocation the linker is permitted to emit a veneer to handle
jumps between symbols that are very far apart.  The registers x16 and
x17 may be clobbered by these veneers.
Hence the function stubs cannot rely on the values of x16 and x17 being
the same as just before the function stub is called.

Similar can be said for the hot/cold partitioning of single functions,
so function-local stubs have the same restriction.

This updated version of the patch never emits function stubs for x16 and
x17, and instead forces other registers to be used.


Given the above, there is now no benefit to local symbols (since they
are not enough to avoid dealing with linker intricacies).  This patch
now uses global symbols with hidden visibility each stored in their own
COMDAT section.  This means stubs can be shared between compilation
units while still avoiding the PLT indirection.


This patch also removes the `__call_indirect_x30` stub (and
function-local equivalent) which would simply jump back to the original
location.


The function-local stubs are emitted to the assembly output file in one
chunk, which means we need not add the speculation barrier directly
after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

The global stubs are emitted in COMDAT/.linkonce sections by
themselves so that the linker can remove duplicates from multiple object
files.  This means they are not emitted in one chunk, and each one must
include the speculation barrier.

Another difference is that since the global stubs are shared across
compilation units we do not know that all functions will be targeting an
architecture supporting the SB instruction.
Rather than provide multiple stubs for each architecture, we provide a
stub that will work for all architectures -- using the DSB+ISB barrier.


This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revalation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revalation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.


gcc/ChangeLog:

2020-06-23  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_indirec

[Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-06-08 Thread Matthew Malcomson
Instructions following RET or BR are not necessarily executed.  In order
to avoid speculation past RET and BR we can simply append a speculation
barrier.

Since these speculation barriers will not be architecturally executed,
they are not expected to add a high performance penalty.

The speculation barrier is to be SB when targeting architectures which
have this enabled, and DSB SY + ISB otherwise.

We add tests for each of the cases where such an instruction was seen.

This is implemented by modifying each machine description pattern that
emits either a RET or a BR instruction.  We choose not to use something
like `TARGET_ASM_FUNCTION_EPILOGUE` since it does not affect the
`indirect_jump`, `jump`, `sibcall_insn` and `sibcall_value_insn`
patterns and we find it preferable to implement the functionality in the
same way for every pattern.

There is one particular case which is slightly tricky.  The
implementation of TARGET_ASM_TRAMPOLINE_TEMPLATE uses a BR which needs
to be mitigated against.  The trampoline template is used *once* per
compilation unit, and the TRAMPOLINE_SIZE is exposed to the user via the
builtin macro __LIBGCC_TRAMPOLINE_SIZE__.
In the future we may implement function specific attributes to turn on
and off hardening on a per-function basis.
The fixed nature of the trampoline described above implies it will be
safer to ensure this speculation barrier is always used.

Testing:
  Bootstrap and regtest done on aarch64-none-linux
  Used a temporary hack(1) to use these options on every test in the
  testsuite and a script to check that the output never emitted an
  unmitigated RET or BR.


1) Temporary hack was a change to the testsuite to always use
`-save-temps` and run a script on the assembly output of those
compilations which produced one to ensure every RET or BR is immediately
followed by a speculation barrier.


gcc/ChangeLog:

2020-06-08  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_sls_barrier): New.
* config/aarch64/aarch64.c (aarch64_output_casesi): Emit
speculation barrier after BR instruction if needs be.
(aarch64_sls_barrier): New.
(aarch64_asm_trampoline_template): Add needed barriers.
* config/aarch64/aarch64.h (AARCH64_ISA_SB): New.
(TARGET_SB): New.
(TRAMPOLINE_SIZE): Account for barrier.
* config/aarch64/aarch64.md (indirect_jump, *casesi_dispatch,
*do_return, simple_return, *sibcall_insn, *sibcall_value_insn):
Emit barrier if needs be, also account for possible barrier in
"length" attribute.
* config/aarch64/aarch64.opt (-mharden-sls-retbr): Introduce new
option.

gcc/testsuite/ChangeLog:

2020-06-08  Matthew Malcomson  

* gcc.target/aarch64/sls-mitigation/sls-miti-retbr.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c:
New test.
* gcc.target/aarch64/sls-mitigation/sls-mitigation.exp: New file.
* lib/target-supports.exp (check_effective_target_aarch64_asm_sb_ok):
New proc.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
8ca67d7e69edaf73c84f079e7e1c483009ad10c0..d2eb739bc89ecd9d0212416b8dc3ee4ba236a271
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,6 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+const char * aarch64_sls_barrier (int);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
24767c747bab0d711627c5c646937c42f210d70b..5da3d94e335fc315e1d90e6a674f2f09cf1a4529
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -281,6 +281,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_F32MM (aarch64_isa_flags & AARCH64_FL_F32MM)
 #define AARCH64_ISA_F64MM (aarch64_isa_flags & AARCH64_FL_F64MM)
 #define AARCH64_ISA_BF16  (aarch64_isa_flags & AARCH64_FL_BF16)
+#define AARCH64_ISA_SB(aarch64_isa_flags & AARCH64_FL_SB)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -378,6 +379,9 @@ extern unsigned aarch64_architecture_version;
 #define TARGET_FIX_ERR_A53_835769_DEFAULT 1
 #endif
 
+/* SB instruction is enabled through +sb.  */
+#define TARGET_SB (AARCH64_ISA_SB)
+
 /* Apply the workaround for Cortex-A53 erratum 835769.  */
 #define TARGET_FIX_ERR_A53_835769  \
   ((aarch64_fix_a53_err835769 == 2)\
@@ -1058,8 +1062,11 @@ typedef struct
 
 #define RETURN_ADDR_RTX aarch64_return_addr
 
-/* BTI c + 3 insns + 2 pointer-sized entries.  */
-#define TRAMPOLINE_SIZE(TARGET_ILP32 ? 24 :

[Patch 3/3] aarch64: Mitigate SLS for BLR instruction

2020-06-08 Thread Matthew Malcomson
This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which
simply consists of a BR to the original register.  These function stubs
are then appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for the entire
compilation unit.
This set of stubs can have human readable names, and we are currently
using `__call_indirect_x` for register x.

As an example when optimising for size:
a
BLR x0
instruction would get transformed to
BL __call_indirect_x0
with __call_indirect_x0 labelling a thunk that contains
__call_indirect_x0:
BR X0
speculation barrier

Since we add these function stubs to the assembly output all in one
chunk, we need not add the speculation barrier directly after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

Special care needs to be given to this transformation occuring in
a context where BTI is enabled.  A BLR can jump to a `BTI c` target,
while a BR can only jump to a `BTI c` target if it uses the registers
x16 or x17.
Hence we use constraints to limit the registers used when this
transformation is being made in an environment that uses BTI.

This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revalation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revalation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.


gcc/ChangeLog:

2020-06-08  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm):
New declaration.
* config/aarch64/aarch64.c (aarch64_use_return_insn_p): Return
false if hardening BLR instructions.
(aarch64_sls_shared_thunks): Global array to store stub labels.
(aarch64_create_blr_label): New.
(print_asm_branch): New macro.
(aarch64_sls_emit_blr_function_thunks): New.
(aarch64_sls_emit_shared_blr_thunks): New.
(aarch64_asm_file_end): New.
(aarch64_indirect_call_asm): New.
(TARGET_ASM_FILE_END): Use aarch64_asm_file_end.
(TARGET_ASM_FUNCTION_EPILOGUE): Use
aarch64_sls_emit_blr_function_thunks.
* config/aarch64/aarch64.h (struct machine_function): Introduce
`call_via` array to store function-local stub labels.
* config/aarch64/aarch64.md (*call_insn, *call_value_insn): Use
aarch64_indirect_call_asm to emit code when hardening BLR
instructions.

gcc/testsuite/ChangeLog:

2020-06-08  Matthew Malcomson  

* gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-blr.c: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
d2eb739bc89ecd9d0212416b8dc3ee4ba236a271..e79f9cbc783e75132e999395ff975f9768436419
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -781,6 +781,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
 const char * aarch64_sls_barrier (int);
+const char * aarch64_indirect_call_asm (rtx);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/conf

[Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-06-08 Thread Matthew Malcomson
Here we introduce the flags that will be used for straight line speculation.

The new flag introduced is `-mharden-sls=`.
This flag can take arguments of `none`, `all`, or a comma seperated list of one
or more of `retbr` or `blr`.
`none` indicates no special mitigation of the straight line speculation
vulnerability.
`all` requests all mitigations currently implemented.
`retbr` requests that the RET and BR instructions have a speculation barrier
inserted after them.
`blr` requests that BLR instructions are replaced by a BL to a function stub
using a BR with a speculation barrier after it.

Setting this on a per-function basis using attributes or the like is not
enabled, but may be in the future.

gcc/ChangeLog:

2020-06-08  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_harden_sls_retbr_p):
New.
(aarch64_harden_sls_blr_p): New.
* config/aarch64/aarch64.c (enum aarch64_sls_hardening_type):
New.
(aarch64_harden_sls_retbr_p): New.
(aarch64_harden_sls_blr_p): New.
(aarch64_validate_sls_mitigation): New.
(aarch64_override_options): Parse options for SLS mitigation.
* config/aarch64/aarch64.opt (-mharden-sls): New option.
* doc/invoke.texi: Document new option.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
9e43adb7db0373df6cc5ef1d2b22f217aca2aad2..8ca67d7e69edaf73c84f079e7e1c483009ad10c0
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,4 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+extern bool aarch64_harden_sls_retbr_p (void);
+extern bool aarch64_harden_sls_blr_p (void);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
e92c7e69fcb7a8689a8b7098b86ff050dc9ab78b..775f49991e5f599a843d3ef490b8cd044acfe78f
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
   return false;
 }
 
+
+/* Straight line speculation indicators.  */
+enum aarch64_sls_hardening_type
+{
+SLS_NONE = 0,
+SLS_RETBR = 1,
+SLS_BLR = 2,
+SLS_ALL = 3,
+};
+static enum aarch64_sls_hardening_type aarch64_sls_hardening;
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_retbr_p (void)
+{
+  return aarch64_sls_hardening & SLS_RETBR;
+}
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_blr_p (void)
+{
+  return aarch64_sls_hardening & SLS_BLR;
+}
+
+/* As of yet we only allow setting these options globally, in the future we may
+   allow setting them per function.  */
+static void
+aarch64_validate_sls_mitigation (const char *const_str)
+{
+  char *str_root = xstrdup (const_str);
+  char *token_save = NULL;
+  char *str = NULL;
+  int temp = SLS_NONE;
+
+  aarch64_sls_hardening = SLS_NONE;
+  if (strcmp (str_root, "none") == 0)
+goto finish;
+  if (strcmp (str_root, "all") == 0)
+{
+  aarch64_sls_hardening = SLS_ALL;
+  goto finish;
+}
+
+  str = strtok_r (str_root, ",", _save);
+  if (!str)
+{
+  error ("invalid argument given to %<-mharden-sls=%>");
+  goto finish;
+}
+
+  while (str)
+{
+  if (strcmp (str, "blr") == 0)
+   temp |= SLS_BLR;
+  else if (strcmp (str, "retbr") == 0)
+   temp |= SLS_RETBR;
+  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
+   {
+ error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ break;
+   }
+  else
+   {
+ error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
+ break;
+   }
+  str = strtok_r (NULL, ",", _save);
+}
+  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
+finish:
+  free (str_root);
+  return;
+}
+
 /* Parses CONST_STR for branch protection features specified in
aarch64_branch_protect_types, and set any global variables required.  
Returns
the parsing result and assigns LAST_STR to the last processed token from
@@ -14710,6 +14785,9 @@ aarch64_override_options (void)
   selected_arch = NULL;
   selected_tune = NULL;
 
+  if (aarch64_harden_sls_string)
+  aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
+
   if (aarch64_branch_protection_string)
 aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 
d99d14c137d8774d

Straight Line Speculation (SLS) mitigation.

2020-06-08 Thread Matthew Malcomson
Hi,

A new speculative cache side-channel vulnerability has been published at
the link below, named "straight-line speculation" (SLS in this patch series).
https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/downloads/straight-line-speculation
This vulnerability has been given CVE number CVE-2020-13844.

We have prepared some toolchain mitigations for this vulnerability.  These
mitigate against the RET, BR case and the BLR case mentioned in the linked
whitepaper.

The part of vulnerability relevant to these toolchain mitigations is as
follows:
Some processors may speculatively execute the instructions immediately
following what should be a change in control flow.  The examples we mitigate in
this patch series are the instructions RET (return), BR (indirect jump) and BLR
(indirect function call).
Where the speculative path contains a suitable code sequence, often described
by researchers as a "Spectre Revelation Gadget", such straight-line speculation
could lead to changes in the caches and similar structures that are indicative
of secrets, making those secrets vulnerable to revelation through timing
analysis.

The gist of the mitigation posted here is:

Every RET and BR instruction has a speculation barrier placed directly after
it.  These speculation barriers are not to be architecturally executed, so the
performance cost is expected to be low.

Each BLR instruction is replaced by a BL to a function stub consisting of a BR
instruction followed by a speculation barrier.
This alternate approach is used since the instructions directly after a BLR are
usually architecturally executed, and this approach ensures the speculation
barrier is off that architecturally executed path.
Arm has been unable to demonstrate straight line speculation past a BL or B
instruction, and so we believe the BL instruction can be used without a
barrier.

In summary, a
  RET
will be transformed to
  RET
  

While a
  BLR x
will be transformed to a
  BL __call_indirect_x
call, with __call_indirect_x being a thunk that looks like
__call_indirect_x:
  BR x
  


The patch series is structured as follows:
1) Introduce new command line arguments.
2) The RET/BR mitigation.
3) The BLR mitigation.

There are a few known places where this toolchain mitigation does not protect
against speculation places:
- Some accesses to thread-local variables use a code sequence including a BLR
  instruction.   This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker.
  It seems that the code sequence for thread-local variable access is unlikely
  to lead to a Spectre Revalation Gadget.
- PLT stubs are produced by the linker, and each contain a BLR instruction.
  It seems that at most this could introduce one spectre relevation gadget
  after the last PLT stub.
- Use of BR, RET, or BLR instructions in assembly are not mitigated.
- Use of BR, RET, or BLR instructions in libraries and run-time library
  routines that are not recompiled with this toolchain mitigation are not
  mitigated.

N.b. patches with similar functionality are being posted to LLVM.

Thanks,
Matthew.


Entire patch series attached to cover letter.

all-patches.tar.gz
Description: application/gzip


[Arm] Account for C++17 artificial field determining Homogeneous Aggregates

2020-04-27 Thread Matthew Malcomson
NOTE:
  There is another patch in the making to handle the APCS abi (selected with
  `-mabi=apcs-gnu`).  That patch has a very different change in behaviour and
  is in a different part of the code so I'm keeping it in a different patch.

In C++14, an empty class deriving from an empty base is not an
aggregate, while in C++17 it is.  In order to implement this, GCC adds
an artificial field to such classes.

This artificial field has no mapping to Fundamental Data Types in the
Arm PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"7.1.2 Procedure Calling" in the arm PCS
https://developer.arm.com/docs/ihi0042/latest?_ga=2.60211309.1506853196.1533541889-405231439.1528186050

This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).

Before this change, the test below would pass the arguments to `f` in
general registers.  After this change, the test passes the arguments to
`f` using the vector registers.

The new behaviour matches the behaviour of `armclang`, and also matches
the GCC behaviour when run with `-std=gnu++14`.

> gcc -std=gnu++17 -march=armv8-a+simd -mfloat-abi=hard test.cpp

``` test.cpp
struct base {};

struct pair : base
{
  float first;
  float second;
  pair (float f, float s) : first(f), second(s) {}
};

void f (pair);
int main()
{
  f({3.14, 666});
  return 1;
}
```

We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions.  Unfortunately this warning is not emitted twice for multiple
calls to the same function, but I feel this is not much of a problem and can be
fixed later if needs be.

(i.e. if `main` called `f` twice in a row we only emit a diagnostic for the
first).

Testing:
Bootstrapped and regression tested on arm-linux.
This change fixes the struct-layout-1 tests Jakub added
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544204.html
Regression tested on arm-none-eabi.

gcc/ChangeLog:

2020-04-27  Matthew Malcomson  
Jakub Jelinek  

PR target/94383
* config/arm/arm.c (aapcs_vfp_sub_candidate): Account for C++17 empty
base class artificial fields.
(aapcs_vfp_is_call_or_return_candidate): Warn when PCS ABI
decision is different after this fix.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
0151bda90d961ae1a001c61cd5e94d6ec67e3aea..0db444c3fdb2ba6fb7ebad410310fb5b1bc0e304
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6139,9 +6139,19 @@ aapcs_vfp_cum_init (CUMULATIVE_ARGS *pcum  
ATTRIBUTE_UNUSED,
If *MODEP is VOIDmode, then set it to the first valid floating point
type.  If a non-floating point type is found, or if a floating point
type that doesn't match a non-VOIDmode *MODEP is found, then return -1,
-   otherwise return the count in the sub-tree.  */
+   otherwise return the count in the sub-tree.
+
+   The AVOID_CXX17_EMPTY_BASE argument is to allow the caller to check whether
+   this function has changed its behavior after the fix for PR94384 -- this fix
+   is to avoid artificial fields in empty base classes.
+   When called with this argument as a NULL pointer this function does not
+   avoid the artificial fields -- this is useful to check whether the function
+   returns something different after the fix.
+   When called pointing at a value, this function avoids such artificial fields
+   and sets the value to TRUE when one of these fields has been set.  */
 static int
-aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
+aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep,
+bool *avoid_cxx17_empty_base)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -6213,7 +6223,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
-   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep);
+   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep,
+avoid_cxx17_empty_base);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -6251,7 +6262,18 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
if (TREE_CODE (field) != FIELD_DECL)
  continue;
 
-   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep);
+   /* Ignore C++17 empty base fields, while their type indicates
+  they do contain padding, they have zero size and thus don't
+  contain any padding.  */
+   if (cxx17_empty_base_field_p (field)
+   && a

[AArch64] (PR94383) Avoid C++17 empty base field checking for HVA/HFA

2020-04-23 Thread Matthew Malcomson
In C++17, an empty class deriving from an empty base is not an
aggregate, while in C++14 it is.  In order to implement this, GCC adds
an artificial field to such classes.

This artificial field has no mapping to Fundamental Data Types in the
AArch64 PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"6.4.2 Parameter Passing Rules" in the AArch64 PCS.
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#the-base-procedure-call-standard

This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).

Before this change, the test below would pass the arguments to `f` in
general registers.  After this change, the test passes the arguments to
`f` using the vector registers.

The new behaviour matches the behaviour of `armclang`, and also matches
the behaviour when run with `-std=gnu++14`.

> gcc -std=gnu++17 test.cpp

``` test.cpp
struct base {};

struct pair : base
{
  float first;
  float second;
  pair (float f, float s) : first(f), second(s) {}
};

void f (pair);
int main()
{
  f({3.14, 666});
  return 1;
}
```

We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions.  Unfortunately this warning is not emitted twice for multiple
calls to the same function, but I feel this is not much of a problem and can be
fixed later if needs be.

(i.e. if `main` called `f` twice in a row we only emit a diagnostic for the
first).

Testing:
Minimal testing done just to demonstrate new warning message and to check
tests related to this still pass.


gcc/ChangeLog:

2020-04-23  Matthew Malcomson  
Jakub Jelinek  

PR target/94383
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Account for C++17
empty base class artificial fields.
(aarch64_vfp_is_call_or_return_candidate): Warn when ABI PCS decision is
different after this fix.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
f728ac530a4ac968de16ea59d2b724ed63b23d6f..25f60ec44ae3a177d0aceb4bae92ea17da91ba05
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -16442,9 +16442,19 @@ aarch64_member_type_forces_blk (const_tree 
field_or_array, machine_mode mode)
If *MODEP is VOIDmode, then set it to the first valid floating point
type.  If a non-floating point type is found, or if a floating point
type that doesn't match a non-VOIDmode *MODEP is found, then return -1,
-   otherwise return the count in the sub-tree.  */
+   otherwise return the count in the sub-tree.
+
+   The AVOID_CXX17_EMPTY_BASE argument is to allow the caller to check whether
+   this function has changed its behavior after the fix for PR94384 -- this fix
+   is to avoid artificial fields in empty base classes.
+   When called with this argument as a NULL pointer this function does not
+   avoid the artificial fields -- this is useful to check whether the function
+   returns something different after the fix.
+   When called pointing at a value, this function avoids such artificial fields
+   and sets the value to TRUE when one of these fields has been set.  */
 static int
-aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
+aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep,
+bool *avoid_cxx17_empty_base)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -16520,7 +16530,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
-   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep);
+   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep,
+avoid_cxx17_empty_base);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -16558,7 +16569,18 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
if (TREE_CODE (field) != FIELD_DECL)
  continue;
 
-   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep);
+   /* Ignore C++17 empty base fields, while their type indicates
+  they do contain padding, they have zero size and thus don't
+  contain any padding.  */
+   if (cxx17_empty_base_field_p (field)
+   && avoid_cxx17_empty_base)
+ {
+   *avoid_cxx17_empty_base = true;
+   continue;
+ }
+
+   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep,
+avoid_cxx17_empty_base);
if (sub_count < 0)
  retu

[AArch64] (PR94383) Avoid C++17 empty base field checking for HVA/HFA

2020-04-21 Thread Matthew Malcomson
In C++17, an empty class deriving from an empty base is not an
aggregate, while in C++14 it is.  In order to implement this, GCC adds
an artificial field to such classes.

This artificial field has no mapping to Fundamental Data Types in the
AArch64 PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"6.4.2 Parameter Passing Rules" in the AArch64 PCS.
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#the-base-procedure-call-standard

This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).

Before this change, the test below would pass the arguments to `f` in
general registers.  After this change, the test passes the arguments to
`f` using the vector registers.

The new behaviour matches the behaviour of `armclang`, and also matches
the behaviour when run with `-std=gnu++14`.

> gcc -std=gnu++17 test.cpp

``` test.cpp
struct base {};

struct pair : base
{
  float first;
  float second;
  pair (float f, float s) : first(f), second(s) {}
};

void f (pair);
int main()
{
  f({3.14, 666});
  return 1;
}
```

We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions.  Unfortunately this warning is currently emitted multiple times
for each problem, but I feel this is not much of a problem and can be fixed
later if needs be.


Testing:
Bootstrapped on aarch64-linux.
Jakub has provided a testsuite change that catches the original problem.
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544204.html
This patch makes that testcase pass.

gcc/ChangeLog:

2020-04-21  Matthew Malcomson  
Jakub Jelinek  

PR target/94383
* config/aarch64/aarch64.c (enum cpp17empty_state): New.
(aapcs_vfp_sub_candidate): Account for C++17 empty base class
artificial fields.
(aarch64_vfp_is_call_or_return_candidate): Warn when ABI PCS decision is
different after this fix.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
c90de65de127992cc0bd9603a32684ebb5bd4fdf..f9f40d52c846afae0d8d2cd9ed4737cdc4b88896
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15909,13 +15909,30 @@ aarch64_conditional_register_usage (void)
 }
 }
 
+enum cpp17empty_state {
+DONT_AVOID = 0,
+AVOID = 1,
+AVOID_DONE = 3
+};
+
 /* Walk down the type tree of TYPE counting consecutive base elements.
If *MODEP is VOIDmode, then set it to the first valid floating point
type.  If a non-floating point type is found, or if a floating point
type that doesn't match a non-VOIDmode *MODEP is found, then return -1,
-   otherwise return the count in the sub-tree.  */
+   otherwise return the count in the sub-tree.
+
+   The AVOID_C17EMPTY_FIELD argument is to allow the caller to check whether
+   this function has changed its behaviour after the fix for PR94384 -- this
+   fix is to avoid artificial fields in empty base classes.
+   When called pointing at a value of DONT_AVOID then this function does not
+   avoid the artificial fields -- this is useful to check whether the function
+   returns something different after the fix.
+   When called pointing at a value which has the AVOID bit set, this function
+   avoids such artificial fields and sets the value to AVOID_DONE when one of
+   these fields has been set.  */
 static int
-aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
+aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep,
+enum cpp17empty_state *avoid_c17empty_field)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -15992,7 +16009,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
-   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep);
+   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep,
+avoid_c17empty_field);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -16030,7 +16048,22 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
if (TREE_CODE (field) != FIELD_DECL)
  continue;
 
-   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep);
+   /* Ignore C++17 empty base fields, while their type indicates
+  they do contain padding, they have zero size and thus don't
+  contain any padding.  */
+   if (DECL_ARTIFICIAL (field)
+   && DECL_NAME (field) == NULL_TREE
+   && RECORD_OR_UNION_TYPE_P (TREE_TYPE (field))
+ 

Re: [PATCH 1/2] testsuite: [arm/cde] Include arm_cde.h and arm_mve.h in arm_v8*m_main_cde*

2020-04-20 Thread Matthew Malcomson

On 20/04/2020 08:47, Christophe Lyon via Gcc-patches wrote:

Since arm_cde.h includes stdint.h, its use requires the presence of
the right gnu/stub-*.h, so make sure to include it when checking the
arm_v8*m_main_cde* effective targets, otherwise we can decide CDE is
supported while it's not really (all tests that use arm_v8m_main_cde*
also include arm_cde.h aynway).

Similarly for the effective targets that also require MVE.

This makes several tests unsupported rather than fail.


Hi Christophe,

This looks like a good idea to me -- though I'm not a maintainer so that 
means little ;-)


I just wanted to ask -- is this needed mainly if testing without a fully 
functional C99 environment?
I would imagine that a fully set up environment would have some sort of 
`stdint.h` available, even if not using the glibc gnu/stub-*.h files 
(e.g. using the newlib headers).


Cheers,
Matthew


---
  gcc/testsuite/lib/target-supports.exp | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index d16498d..23a5abf 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5140,21 +5140,25 @@ proc add_options_for_arm_v8_2a_bf16_neon { flags } {
  #   /* { dg-add-options arm_v8m_main_cde } */
  # The tests are valid for Arm.
  
-foreach { armfunc armflag armdef } {

+foreach { armfunc armflag armdef arminc } {
arm_v8m_main_cde
"-march=armv8-m.main+cdecp0+cdecp6 -mthumb"
"defined (__ARM_FEATURE_CDE)"
+   ""
arm_v8m_main_cde_fp
"-march=armv8-m.main+fp+cdecp0+cdecp6 -mthumb -mfpu=auto"
"defined (__ARM_FEATURE_CDE) && defined (__ARM_FP)"
+   ""
arm_v8_1m_main_cde_mve
"-march=armv8.1-m.main+mve+cdecp0+cdecp6 -mthumb -mfpu=auto"
"defined (__ARM_FEATURE_CDE) && defined (__ARM_FEATURE_MVE)"
+   "#include "
arm_v8_1m_main_cde_mve_fp
"-march=armv8.1-m.main+mve.fp+cdecp0+cdecp6 -mthumb -mfpu=auto"
"defined (__ARM_FEATURE_CDE) || __ARM_FEATURE_MVE == 3"
+   "#include "
} {
-eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef ] {
+eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef INC $arminc 
] {
proc check_effective_target_FUNC_ok_nocache { } {
global et_FUNC_flags
set et_FUNC_flags ""
@@ -5167,6 +5171,8 @@ foreach { armfunc armflag armdef } {
#if !(DEF)
#error "DEF failed"
#endif
+   #include 
+   INC
} "FLAG"] } {
set et_FUNC_flags "FLAG"
return 1





[Arm] Disallow arm_movdi when targetting MVE

2020-04-15 Thread Matthew Malcomson
Without disabling this, the pattern can try and move DImode values
between floating point registers and general registers.
The constraints on this pattern can't handle that, and reload goes into
an infinite loop.

This was the cause of a testsuite failure in cde_v_1_mve.c, which is now
gone.

A DImode move for MVE now uses the `movdi_vfp` pattern, which is the same
pattern used for such a move when MVE is not available but the target has
TARGET_HARD_FLOAT.

Testing done:
Bootstrapped and regtested on arm-none-linux-gnueabihf
regtested on arm-none-eabi

gcc/ChangeLog:

2020-04-15  Matthew Malcomson  

* config/arm/arm.md (arm_movdi): Disallow for MVE.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
7bc55cce61b2e45e5875a233dd4546d59399d749..a6a31f8f4ef8d3c8c96c61b450dbd7c28d08b55c
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6233,6 +6233,7 @@
(match_operand:DI 1 "di_operand"  "rDa,Db,Dc,mi,r"))]
   "TARGET_32BIT
&& !(TARGET_HARD_FLOAT)
+   && !(TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT)
&& !TARGET_IWMMXT
&& (   register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))"

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
7bc55cce61b2e45e5875a233dd4546d59399d749..a6a31f8f4ef8d3c8c96c61b450dbd7c28d08b55c
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6233,6 +6233,7 @@
(match_operand:DI 1 "di_operand"  "rDa,Db,Dc,mi,r"))]
   "TARGET_32BIT
&& !(TARGET_HARD_FLOAT)
+   && !(TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT)
&& !TARGET_IWMMXT
&& (   register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))"



[Arm] Allow the use of arm_cde.h for C++

2020-04-09 Thread Matthew Malcomson
arm_cde.h includes the arm_mve_types.h header, which declares some C++
overloaded functions.

There is a superfluous `extern "C"` statement in arm_cde.h, which
encompasses these functions.  This means that if compiling for C++, the
overloaded functions are declared, but are declared without name
mangling.  Hence all the function names are the same and we have many
conflicting declarations.

Testing Done:
  Regression tested for arm-none-eabi.

gcc/ChangeLog:

2020-04-09  Matthew Malcomson  

* config/arm/arm_cde.h: Remove `extern "C"` when compiling for
C++.

gcc/testsuite/ChangeLog:

2020-04-09  Matthew Malcomson  

* g++.target/arm/cde_mve.C: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_cde.h b/gcc/config/arm/arm_cde.h
index 
d8ddda6bd648d8b94e97f7b6403b7708cebc9eb3..0ba3ee02d057dd280c9b3400ef70b46ed472aec3
 100644
--- a/gcc/config/arm/arm_cde.h
+++ b/gcc/config/arm/arm_cde.h
@@ -27,10 +27,6 @@
 #ifndef _GCC_ARM_CDE_H
 #define _GCC_ARM_CDE_H 1
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #include 
 
 #if defined (__ARM_FEATURE_CDE)
@@ -177,8 +173,4 @@ extern "C" {
 
 #endif
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/gcc/testsuite/g++.target/arm/cde_mve.C 
b/gcc/testsuite/g++.target/arm/cde_mve.C
new file mode 100644
index 
..897cbd2b8119471385c1c51158b1aa4851acbaf1
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/cde_mve.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
+
+/* Ensure this compiles.  */
+#include "arm_cde.h"
+int foo ()
+{
+  return 1;
+}

diff --git a/gcc/config/arm/arm_cde.h b/gcc/config/arm/arm_cde.h
index 
d8ddda6bd648d8b94e97f7b6403b7708cebc9eb3..0ba3ee02d057dd280c9b3400ef70b46ed472aec3
 100644
--- a/gcc/config/arm/arm_cde.h
+++ b/gcc/config/arm/arm_cde.h
@@ -27,10 +27,6 @@
 #ifndef _GCC_ARM_CDE_H
 #define _GCC_ARM_CDE_H 1
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #include 
 
 #if defined (__ARM_FEATURE_CDE)
@@ -177,8 +173,4 @@ extern "C" {
 
 #endif
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/gcc/testsuite/g++.target/arm/cde_mve.C 
b/gcc/testsuite/g++.target/arm/cde_mve.C
new file mode 100644
index 
..897cbd2b8119471385c1c51158b1aa4851acbaf1
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/cde_mve.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
+
+/* Ensure this compiles.  */
+#include "arm_cde.h"
+int foo ()
+{
+  return 1;
+}



[wwwdocs] [arm] Document CDE intrinsics from ACLE

2020-04-08 Thread Matthew Malcomson
New in GCC 10.


### Attachment also inlined for ease of reply###


diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 
3d8e0ba989e860d307310378a2be99b32a27261f..389561d13c69b650528e8ed8859ebbb5760438c6
 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -624,6 +624,9 @@ a work-in-progress.
   added: this M-profile feature is no longer restricted to targets
   with MOVT. For example, -mcpu=cortex-m0
   now supports this option.
+  Support for the Custom Datapath Extension ACLE
+  https://developer.arm.com/docs/101028/0010/custom-datapath-extension;>
+  intrinsics has been added.
 
 
 AVR

diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 
3d8e0ba989e860d307310378a2be99b32a27261f..389561d13c69b650528e8ed8859ebbb5760438c6
 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -624,6 +624,9 @@ a work-in-progress.
   added: this M-profile feature is no longer restricted to targets
   with MOVT. For example, -mcpu=cortex-m0
   now supports this option.
+  Support for the Custom Datapath Extension ACLE
+  https://developer.arm.com/docs/101028/0010/custom-datapath-extension;>
+  intrinsics has been added.
 
 
 AVR



[PATCH] [Arm] Implement scalar Custom Datapath Extension intrinsics

2020-04-08 Thread Matthew Malcomson
Suggestions applied.  I didn't update the indentation since it seems good to me
(with the `(unspec:SIDI ...)` indented one more level than the SET as
suggested).  I'm guessing it looked strange since that line is indented with a
tab and that can look different in different text editors.


This patch introduces the scalar CDE (Custom Datapath Extension)
intrinsics for the arm backend.

There is nothing beyond the standard in this patch.  We simply build upon what
has been done by Dennis for the vector intrinsics.

We do add `+cdecp6` to the default arguments for `target-supports.exp`, this
allows for using coprocessor 6 in tests. This patch uses an alternate
coprocessor to ease assembler scanning by looking for a use of coprocessor 6.

We also ensure that any DImode registers are put in an even-odd register pair
when compiling for a target with CDE -- this avoids faulty code generation for
-Os when producing the cx*d instructions.

Testing done:
Bootstrapped and regtested for arm-none-linux-gnueabihf.

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config/arm/arm.c (arm_hard_regno_mode_ok): DImode registers forced
into even-odd register pairs for TARGET_CDE.
* config/arm/arm.h (ARM_CCDE_CONST_1): New.
(ARM_CCDE_CONST_2): New.
(ARM_CCDE_CONST_3): New.
* config/arm/arm.md (arm_cx1si, arm_cx1di arm_cx1asi, arm_cx1adi
arm_cx2si, arm_cx2di arm_cx2asi, arm_cx2adi arm_cx3si, arm_cx3di
arm_cx3asi, arm_cx3adi): New patterns.
* config/arm/arm_cde.h (__arm_cx1, __arm_cx1a, __arm_cx2, __arm_cx2a,
__arm_cx3, __arm_cx3a, __arm_cx1d, __arm_cx1da, __arm_cx2d, __arm_cx2da,
__arm_cx3d, __arm_cx3da): New ACLE function macros.
* config/arm/arm_cde_builtins.def (cx1, cx1a, cx2, cx2a, cx3, cx3a):
Define intrinsics.
* config/arm/iterators.md (cde_suffix, cde_dest): New mode attributes.
* config/arm/predicates.md (const_int_ccde1_operand,
const_int_ccde2_operand, const_int_ccde3_operand): New.
* config/arm/unspecs.md (UNSPEC_CDE, UNSPEC_CDEA): New.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: New test.
* gcc.target/arm/acle/cde.c: New test.
* lib/target-supports.exp: Update CDE flags to enable coprocessor 6.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
ca36a74cd1fa161c388961588fa0f96030b7888e..83886a2fcb3844f6a5060e451125a6cd2d505c5c
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -576,6 +576,9 @@ extern int arm_arch_cde;
 extern int arm_arch_cde_coproc;
 extern const int arm_arch_cde_coproc_bits[];
 #define ARM_CDE_CONST_COPROC   7
+#define ARM_CCDE_CONST_1   ((1 << 13) - 1)
+#define ARM_CCDE_CONST_2   ((1 << 9 ) - 1)
+#define ARM_CCDE_CONST_3   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_1   ((1 << 11) - 1)
 #define ARM_VCDE_CONST_2   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_3   ((1 << 3 ) - 1)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
da0bfbc35501ba40324a38ee9ebc194f43196837..be076e4ac59be7f224b769bbca4013a554b50c07
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25057,10 +25057,11 @@ arm_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (ARM_NUM_REGS (mode) > 4)
return false;
 
-  if (TARGET_THUMB2 && !TARGET_HAVE_MVE)
+  if (TARGET_THUMB2 && !(TARGET_HAVE_MVE || TARGET_CDE))
return true;
 
-  return !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
+  return !((TARGET_LDRD || TARGET_CDE)
+  && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
 }
 
   if (regno == FRAME_POINTER_REGNUM
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
6d5560398dae3d0ace0342b4907542d2a6865f70..9c4d66f4efe70d9ab8896865cbf45285e5cfbaf9
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4408,6 +4408,70 @@
(set_attr "shift" "3")
(set_attr "type" "logic_shift_reg")])
 


+;; Custom Datapath Extension insns.
+(define_insn "arm_cx1"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SI 2 "const_int_ccde1_operand" "i")]
+   UNSPEC_CDE))]
+   "TARGET_CDE"
+   "cx1\\tp%c1, , %2"
+)
+
+(define_insn "arm_cx1a"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SIDI 2 "s_register_operand" "0")
+ 

[PATCH] [Arm] Implement scalar Custom Datapath Extension intrinsics

2020-04-08 Thread Matthew Malcomson
Patch rebased onto recent trunk.


This patch introduces the scalar CDE (Custom Datapath Extension)
intrinsics for the arm backend.

There is nothing beyond the standard in this patch.  We simply build upon what
has been done by Dennis for the vector intrinsics.

We do add `+cdecp6` to the default arguments for `target-supports.exp`, this
allows for using coprocessor 6 in tests. This patch uses an alternate
coprocessor to ease assembler scanning by looking for a use of coprocessor 6.

We also ensure that any DImode registers are put in an even-odd register pair
when compiling for a target with CDE -- this avoids faulty code generation for 
-Os
when producing the cx*d instructions.

Testing done:
Bootstrapped and regtested for arm-none-linux-gnueabihf.

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config/arm/arm.c (arm_hard_regno_mode_ok): DImode registers forced 
into
even-odd register pairs for TARGET_CDE.
* config/arm/arm.h (ARM_CCDE_CONST_1): New.
(ARM_CCDE_CONST_2): New.
(ARM_CCDE_CONST_3): New.
* config/arm/arm.md (arm_cx1si, arm_cx1di arm_cx1asi, arm_cx1adi 
arm_cx2si,
arm_cx2di arm_cx2asi, arm_cx2adi arm_cx3si, arm_cx3di arm_cx3asi,
arm_cx3adi): New patterns.
* config/arm/arm_cde.h (__arm_cx1, __arm_cx1a, __arm_cx2, __arm_cx2a,
__arm_cx3, __arm_cx3a, __arm_cx1d, __arm_cx1da, __arm_cx2d, __arm_cx2da,
__arm_cx3d, __arm_cx3da): New ACLE function macros.
* config/arm/arm_cde_builtins.def (cx1, cx1a, cx2, cx2a, cx3, cx3a): 
Define
intrinsics.
* config/arm/iterators.md (cde_suffix, cde_dest): New mode attributes.
* config/arm/predicates.md (const_int_ccde1_operand,
const_int_ccde2_operand, const_int_ccde3_operand): New.
* config/arm/unspecs.md (UNSPEC_CDE, UNSPEC_CDEA): New.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: New test.
* gcc.target/arm/acle/cde.c: New test.
* lib/target-supports.exp: Update CDE flags to enable coprocessor 6.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
ca36a74cd1fa161c388961588fa0f96030b7888e..83886a2fcb3844f6a5060e451125a6cd2d505c5c
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -576,6 +576,9 @@ extern int arm_arch_cde;
 extern int arm_arch_cde_coproc;
 extern const int arm_arch_cde_coproc_bits[];
 #define ARM_CDE_CONST_COPROC   7
+#define ARM_CCDE_CONST_1   ((1 << 13) - 1)
+#define ARM_CCDE_CONST_2   ((1 << 9 ) - 1)
+#define ARM_CCDE_CONST_3   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_1   ((1 << 11) - 1)
 #define ARM_VCDE_CONST_2   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_3   ((1 << 3 ) - 1)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
da0bfbc35501ba40324a38ee9ebc194f43196837..be076e4ac59be7f224b769bbca4013a554b50c07
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25057,10 +25057,11 @@ arm_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (ARM_NUM_REGS (mode) > 4)
return false;
 
-  if (TARGET_THUMB2 && !TARGET_HAVE_MVE)
+  if (TARGET_THUMB2 && !(TARGET_HAVE_MVE || TARGET_CDE))
return true;
 
-  return !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
+  return !((TARGET_LDRD || TARGET_CDE)
+  && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
 }
 
   if (regno == FRAME_POINTER_REGNUM
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
6d5560398dae3d0ace0342b4907542d2a6865f70..9c4d66f4efe70d9ab8896865cbf45285e5cfbaf9
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4408,6 +4408,70 @@
(set_attr "shift" "3")
(set_attr "type" "logic_shift_reg")])
 


+;; Custom Datapath Extension insns.
+(define_insn "arm_cx1"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SI 2 "const_int_ccde1_operand" "i")]
+   UNSPEC_CDE))]
+   "TARGET_CDE"
+   "cx1\\tp%c1, , %2"
+)
+
+(define_insn "arm_cx1a"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SIDI 2 "s_register_operand" "0")
+  (match_operand:SI 3 "const_int_ccde1_operand" "i")]
+   UNSPEC_CDEA))]
+   "TARGET_CDE"
+   "cx1a\\tp%c1, , %3"
+)
+
+(define_insn "arm_cx2"
+   [(set (match_operand:SIDI 0 &quo

[Arm] Implement CDE predicated intrinsics for MVE registers

2020-04-08 Thread Matthew Malcomson
This is an update of the previous patch but rebased onto recent MVE patches.
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543414.html

These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543527.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
Bootstrap and full regtest on arm-none-linux-gnueabihf
Full regtest on arm-none-eabi

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
(CX_BINARY_UNONE_QUALIFIERS): New.
(CX_TERNARY_UNONE_QUALIFIERS): New.
(arm_resolve_overloaded_builtin): Move to arm-c.c.
(arm_expand_builtin_args): Update error message.
(enum resolver_ident): New.
(arm_describe_resolver): New.
(arm_cde_end_args): New.
* config/arm/arm-builtins.h: New file.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
(arm_resolve_cde_builtin): Moved from arm-builtins.c.
* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
New.
* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
* config/arm/iterators.md (CDE_VCX): New int iterator.
(a) New int attribute.
* config/arm/mve.md (arm_vcx1q_p_v16qi, arm_vcx2q_p_v16qi,
arm_vcx3q_p_v16qi): New patterns.
* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
forms.
* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
* gcc.target/arm/acle/cde_v_1_err.c (test_imm_range): Update for
error message format change

[Arm] Implement CDE intrinsics for MVE registers.

2020-04-08 Thread Matthew Malcomson

This patch updates the previous one by rebasing onto the recent MVE patches on
trunk.

Implement CDE intrinsics on MVE registers.

Other than the basics required for adding intrinsics this patch consists
of three changes.

** We separate out the MVE types and casts from the arm_mve.h header.

This is so that the types can be used in arm_cde.h without the need to include
the entire arm_mve.h header.
The only type that arm_cde.h needs is `uint8x16_t`, so this separation could be
avoided by using a `typedef` in this file.
Since the introduced intrinsics are all defined to act on the full range of MVE
types, declaring all such types seems intuitive since it will provide their
declaration to the user too.

This arm_mve_types.h header not only includes the MVE types, but also
the conversion intrinsics between them.
Some of the conversion intrinsics are needed for arm_cde.h, but most are
not.  We include all conversion intrinsics to keep the definition of
such conversion functions all in one place, on the understanding that
extra conversion functions being defined when including `arm_cde.h` is
not a problem.

** We define the TARGET_RESOLVE_OVERLOADED_BUILTIN hook for the Arm backend.

This is needed to implement the polymorphism for the required intrinsics.
The intrinsics have no specialised version, and the resulting assembly
instruction for all different types should be exactly the same.
Due to this we have implemented these intrinsics via one builtin on one type.
All other calls to the intrinsic with different types are implicitly cast to
the one type that is defined, and hence are all expanded to the same RTL
pattern that is only defined for one machine mode.

** We seperate the initialisation of the CDE intrinsics from others.

This allows us to ensure that the CDE intrinsics acting on MVE registers
are only created when both CDE and MVE are available.
Only initialising these builtins when both features are available is
especially important since they require a type that is only initialised
when the target supports hard float.  Hence trying to initialise these
builtins on a soft float target would cause an ICE.

Testing done:
  Full bootstrap and regtest on arm-none-linux-gnueabihf
  Regression test on arm-none-eabi


NOTE: This patch depends on the CDE intrinsic framework patch by Dennis.
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542008.html


Ok for trunk?

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config.gcc (arm_mve_types.h): New extra_header for arm.
* config/arm/arm-builtins.c (arm_resolve_overloaded_builtin): New.
(arm_init_cde_builtins): New.
(arm_init_acle_builtins): Remove initialisation of CDE builtins.
(arm_init_builtins): Call arm_init_cde_builtins when target
supports CDE.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New declaration.
(arm_register_target_pragmas): Initialise resolve_overloaded_builtin
hook to the implementation for the arm backend.
* config/arm/arm.h (ARM_MVE_CDE_CONST_1): New.
(ARM_MVE_CDE_CONST_2): New.
(ARM_MVE_CDE_CONST_3): New.
* config/arm/arm_cde.h (__arm_vcx1q_u8): New.
(__arm_vcx1qa): New.
(__arm_vcx2q): New.
(__arm_vcx2q_u8): New.
(__arm_vcx2qa): New.
(__arm_vcx3q): New.
(__arm_vcx3q_u8): New.
(__arm_vcx3qa): New.
* config/arm/arm_cde_builtins.def (vcx1q, vcx1qa, vcx2q, vcx2qa, vcx3q,
vcx3qa): New builtins defined.
* config/arm/arm_mve.h: Move typedefs and conversion intrinsics
to arm_mve_types.h header.
* config/arm/arm_mve_types.h: New file.
* config/arm/mve.md (arm_vcx1qv16qi, arm_vcx1qav16qi, arm_vcx2qv16qi,
arm_vcx2qav16qi, arm_vcx3qv16qi, arm_vcx3qav16qi): New patterns.
* config/arm/predicates.md (const_int_mve_cde1_operand,
const_int_mve_cde2_operand, const_int_mve_cde3_operand): New.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  
Dennis Zhang  

* gcc.target/arm/acle/cde-mve-error-1.c: New test.
* gcc.target/arm/acle/cde-mve-error-2.c: New test.
* gcc.target/arm/acle/cde-mve-error-3.c: New test.
* gcc.target/arm/acle/cde-mve-full-assembly.c: New test.
* gcc.target/arm/acle/cde-mve-tests.c: New test.
* lib/target-supports.exp (arm_v8_1m_main_cde_mve_fp): New check
effective.


cde-mve-intrinsics.patch.gz
Description: application/gzip


[Arm] Implement CDE predicated intrinsics for MVE registers

2020-04-07 Thread Matthew Malcomson
These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542725.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
Bootstrap and full regtest on arm-none-linux-gnueabihf
Full regtest on arm-none-eabi

All testing done with a local fix for the bugzilla PR below.
That bugzilla currently causes multiple ICE's on the tests added in
this patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94341

gcc/ChangeLog:

2020-04-07  Matthew Malcomson  

* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
(CX_BINARY_UNONE_QUALIFIERS): New.
(CX_TERNARY_UNONE_QUALIFIERS): New.
(arm_resolve_overloaded_builtin): Move to arm-c.c.
(arm_expand_builtin_args): Update error message.
(enum resolver_ident): New.
(arm_describe_resolver): New.
(arm_cde_end_args): New.
* config/arm/arm-builtins.h: New file.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
(arm_resolve_cde_builtin): Moved from arm-builtins.c.
* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
New.
* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
* config/arm/iterators.md (CDE_VCX): New int iterator.
(a) New int attribute.
* config/arm/mve.md (arm_vcx1q_p_v16qi, arm_vcx2q_p_v16qi,
arm_vcx3q_p_v16qi): New patterns.
* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-07  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
forms.
* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
* gcc.target/arm/acle/cde_v_1_err.c

[Arm] Implement CDE intrinsics for MVE registers.

2020-03-26 Thread Matthew Malcomson
[Arm] Implement CDE intrinsics for MVE registers.

Implement CDE intrinsics on MVE registers.

Other than the basics required for adding intrinsics this patch consists
of three changes.

** We separate out the MVE types and casts from the arm_mve.h header.

This is so that the types can be used in arm_cde.h without the need to include
the entire arm_mve.h header.
The only type that arm_cde.h needs is `uint8x16_t`, so this separation could be
avoided by using a `typedef` in this file.
Since the introduced intrinsics are all defined to act on the full range of MVE
types, declaring all such types seems intuitive since it will provide their
declaration to the user too.

This arm_mve_types.h header not only includes the MVE types, but also
the conversion intrinsics between them.
Some of the conversion intrinsics are needed for arm_cde.h, but most are
not.  We include all conversion intrinsics to keep the definition of
such conversion functions all in one place, on the understanding that
extra conversion functions being defined when including `arm_cde.h` is
not a problem.

** We define the TARGET_RESOLVE_OVERLOADED_BUILTIN hook for the Arm backend.

This is needed to implement the polymorphism for the required intrinsics.
The intrinsics have no specialised version, and the resulting assembly
instruction for all different types should be exactly the same.
Due to this we have implemented these intrinsics via one builtin on one type.
All other calls to the intrinsic with different types are implicitly cast to
the one type that is defined, and hence are all expanded to the same RTL
pattern that is only defined for one machine mode.

** We seperate the initialisation of the CDE intrinsics from others.

This allows us to ensure that the CDE intrinsics acting on MVE registers
are only created when both CDE and MVE are available.
Only initialising these builtins when both features are available is
especially important since they require a type that is only initialised
when the target supports hard float.  Hence trying to initialise these
builtins on a soft float target would cause an ICE.

Testing done:
  Full bootstrap and regtest on arm-none-linux-gnueabihf
  Regression test on arm-none-eabi

NOTE: These tests ICE on arm-none-eabi.
This is due to bugzilla bug 94341, and not due to a problem in this patch.

NOTE2: This patch depends on the CDE intrinsic framework patch by
Dennis.
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542008.html


Ok for trunk?

gcc/ChangeLog:

2020-03-26  Matthew Malcomson  

* config.gcc (arm_mve_types.h): New extra_header for arm.
* config/arm/arm-builtins.c (arm_resolve_overloaded_builtin): New.
(arm_init_cde_builtins): New.
(arm_init_acle_builtins): Remove initialisation of CDE builtins.
(arm_init_builtins): Call arm_init_cde_builtins when target
supports CDE.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New declaration.
(arm_register_target_pragmas): Initialise resolve_overloaded_builtin
hook to the implementation for the arm backend.
* config/arm/arm.h (ARM_MVE_CDE_CONST_1): New.
(ARM_MVE_CDE_CONST_2): New.
(ARM_MVE_CDE_CONST_3): New.
* config/arm/arm_cde.h (__arm_vcx1q_u8): New.
(__arm_vcx1qa): New.
(__arm_vcx2q): New.
(__arm_vcx2q_u8): New.
(__arm_vcx2qa): New.
(__arm_vcx3q): New.
(__arm_vcx3q_u8): New.
(__arm_vcx3qa): New.
* config/arm/arm_cde_builtins.def (vcx1q, vcx1qa, vcx2q, vcx2qa, vcx3q,
vcx3qa): New builtins defined.
* config/arm/arm_mve.h: Move typedefs and conversion intrinsics
to arm_mve_types.h header.
* config/arm/arm_mve_types.h: New file.
* config/arm/mve.md (arm_vcx1qv16qi, arm_vcx1qav16qi, arm_vcx2qv16qi,
arm_vcx2qav16qi, arm_vcx3qv16qi, arm_vcx3qav16qi): New patterns.
* config/arm/predicates.md (const_int_mve_cde1_operand,
const_int_mve_cde2_operand, const_int_mve_cde3_operand): New.

gcc/testsuite/ChangeLog:

2020-03-26  Matthew Malcomson  
Dennis Zhang  

* gcc.target/arm/acle/cde-mve-error-1.c: New test.
* gcc.target/arm/acle/cde-mve-error-2.c: New test.
* gcc.target/arm/acle/cde-mve-error-3.c: New test.
* gcc.target/arm/acle/cde-mve-full-assembly.c: New test.
* gcc.target/arm/acle/cde-mve-tests.c: New test.
* lib/target-supports.exp (arm_v8_1m_main_cde_mve_fp): New check
effective.


cde_mve_regs.patch.gz
Description: application/gzip


Fwd: [testsuite] Add @ lines to check-function-bodies fluff

2020-03-10 Thread Matthew Malcomson

Cc'ing maintainers and original author of `check-function-bodies`.

It looks like I missed that the first time around.


 Forwarded Message 
Subject: [testsuite] Add @ lines to check-function-bodies fluff
Date: Tue, 10 Mar 2020 17:22:52 +
From: Matthew Malcomson 
To: gcc-patches@gcc.gnu.org
CC: n...@arm.com

When using `check-function-bodies`, the subroutine 
`parse_function_bodies` uses

the `fluff` regexp to remove uninteresting assembly lines.

Arm targets generate assembly with some lines prefixed by `@`, these 
lines are

left by this process.

As an example of some lines prefixed by `@': the assembly output from the
`stacktest1` function in "bfloat16_simd_3_1.c" is:

.align  2
.global stacktest1
.arch armv8.2-a
.syntax unified
.arm
.fpu neon-fp-armv8
.type   stacktest1, %function
stacktest1:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #8
add r3, sp, #6
vst1.16 {d0[0]}, [r3]
vld1.16 {d0[0]}, [r3]
add sp, sp, #8
@ sp needed
bx  lr
.size   stacktest1, .-stacktest1



It seems that previous uses of `check-function-bodies` in the arm 
backend have
avoided problems with such lines since they use the `...` regexp in each 
place

such fluff occurs.

I'm currently writing a patch that I'd like to match the entire function 
body,

so I'd like to remove such `@` lines automatically.

gcc/testsuite/ChangeLog:

2020-03-10  Matthew Malcomson  

* lib/scanasm.exp (parse_function_bodies): Lines starting with '@' also
counted as fluff.



### Attachment also inlined for ease of reply 
###



diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765 
100644

--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
  # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
  set fd [open $filename r]
 set in_function 0


diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765
 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0



[testsuite] Add @ lines to check-function-bodies fluff

2020-03-10 Thread Matthew Malcomson
When using `check-function-bodies`, the subroutine `parse_function_bodies` uses
the `fluff` regexp to remove uninteresting assembly lines.

Arm targets generate assembly with some lines prefixed by `@`, these lines are
left by this process.

As an example of some lines prefixed by `@': the assembly output from the
`stacktest1` function in "bfloat16_simd_3_1.c" is:

.align  2
.global stacktest1
.arch armv8.2-a
.syntax unified
.arm
.fpu neon-fp-armv8
.type   stacktest1, %function
stacktest1:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #8
add r3, sp, #6
vst1.16 {d0[0]}, [r3]
vld1.16 {d0[0]}, [r3]
add sp, sp, #8
@ sp needed
bx  lr
.size   stacktest1, .-stacktest1



It seems that previous uses of `check-function-bodies` in the arm backend have
avoided problems with such lines since they use the `...` regexp in each place
such fluff occurs.

I'm currently writing a patch that I'd like to match the entire function body,
so I'd like to remove such `@` lines automatically.

gcc/testsuite/ChangeLog:

2020-03-10  Matthew Malcomson  

* lib/scanasm.exp (parse_function_bodies): Lines starting with '@' also
counted as fluff.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765
 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0

diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765
 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0



[AArch64] effective_target for aarch64 f64mm asm

2020-01-21 Thread Matthew Malcomson
Commit 9ceec73 introduced intrinsics for the AArch64 FP64 matrix
multiply instructions.  These require binutils support for the same
instructions.
( See https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01234.html for the
testsuite failures this introduced. )

This patch adds a DejaGNU test to ensure this binutils support is there
and uses it in the files that need this test.

NOTE:
I tried to find some way to run the assembly tests if the given version
of binutils is available, but run the compile tests if not.  This is
pretty awkward -- It seems I either have to duplicate all the DejaGNU
comments between two files, or write filename exceptions into the
list of files that aarch64-sve-acle-asm.exp runs tests for.

I decided to not do either, since I figure not running the tests on
older binutils isn't too bad compared to having a bunch more DejaGNU
stuff making the tests harder to read.

Testing Done:
Checked on a cross-compiler that:
Tests running for binutils commit e264b5b7a are listed as UNSUPPORTED.
Tests running for binutils commit 26916852e all pass.

gcc/testsuite/ChangeLog:

2020-01-21  Matthew Malcomson  

* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Use require
directive.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise.
* lib/target-supports.exp: Add assembly requirement directive.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
index 
7badc75a43ab2009e9406afc04c980fc01834716..6eb94f1ca5fda961bb23f5c4cf66bc5694d26f36
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
index 
dd8a1c53cd0fb7b7acd0b92394f3977382ac26e0..0a77c37ddd5978db765b4bad23f24da82a0aed11
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
index 
30563698310f65060d34be4bef4c57a74ef9d734..65c6d9b02b804a1480cd5014c8f0ca5f534bacbe
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
index 
d4702fa6cc15e9f93751d8579cfecfd37759306e..e3dc9bd51cf93c2f97e4277181e686d0bf53a1ea
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
index 
4604b0b5fbfb716ae814bf88f7acfe8bf0eaa9f5..f3af8e5cc25791a56930618abf5c092d51e94e9b
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
@@ -1,5 +1,6 @@
 /

Re: [Patch] [AArch64] [SVE] Implement svld1ro intrinsic.

2020-01-20 Thread Matthew Malcomson
On 20/01/2020 14:53, Christophe Lyon wrote:
> On Thu, 9 Jan 2020 at 16:53, Matthew Malcomson
>  wrote:
>>
>> +   (match_test "aarch64_sve_ld1ro_operand_p (op, DImode)")))
>> +
> 
> 
>>   (define_predicate "aarch64_sve_ldff1_operand"
>> (and (match_code "mem")
> 
>>  (match_test "aarch64_sve_ldff1_operand_p (op)")))
> 
> 
> Hi,
> 
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
>> new file mode 100644
>> index 
>> ..7badc75a43ab2009e9406afc04c980fc01834716
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
>> @@ -0,0 +1,119 @@
>> +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
>> +/* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
> 
> What is the binutils version requirement for this?
> Some validations using binutils-2.33.1 exhibit failures like:
> /xgcc 
> -B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/gcc/
> -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never -fdiagnostics-urls=never -std=c90 -O0 -g
> -DTEST_FULL -march=armv8.2-a+sve -fno-ipa-icf
> -march=armv8.6-a+sve+f64mm -c -o ld1ro_s16.o
> /gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
> Assembler messages:
> Error: unknown architecture `armv8.6-a+sve+f64mm'
> 
> Error: unrecognized option -march=armv8.6-a+sve+f64mm
> compiler exited with status 1
> FAIL: gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c  -std=c90 -O0 -g
> -DTEST_FULL  1 blank line(s) in output
> FAIL: gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c  -std=c90 -O0 -g
> -DTEST_FULL (test for excess errors)
> Excess errors:
> Assembler messages:
> Error: unknown architecture `armv8.6-a+sve+f64mm'
> Error: unrecognized option -march=armv8.6-a+sve+f64mm
> 
> 
> while other configurations using 2.32 binutils seem to pass this test:
> /xgcc 
> -B/home/tcwg-buildslave/workspace/tcwg-buildfarm__0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_3684bbb022cd75da55e1457673f269980aa12cdf-stage2/gcc/
> /home/tcwg-buildslave/workspace/tcwg-buildfarm__0/snapshots/gcc.git~master_rev_3684bbb022cd75da55e1457673f269980aa12cdf/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
> -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never -fdiagnostics-urls=never -std=c90 -O0 -g
> -DTEST_FULL -march=armv8.2-a+sve -fno-ipa-icf
> -march=armv8.6-a+sve+f64mm -S -o ld1ro_f16.s
> PASS: gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c  -std=c90 -O0 -g
> -DTEST_FULL (test for excess errors)
> 
> Ha... took me a while to realize that in the latter case we stop after
> generating the .s file and do not call the assembler...
> 
> 
> So... do we want/need additional consistency checks between gcc and
> gas versions?
> 

Ah! Yes, that should certainly be done.
I'll start that now.

Thanks for pointing it out -- it seems I really did not take enough care 
with this patch...

MM

> 
> Thanks,
> 
> Christophe
> 
> 
>> +



[PATCH][AArch64] Enable CLI for Armv8.6-A f64mm

2020-01-13 Thread Matthew Malcomson
This patch is necessary for sve-ld1ro intrinsic I posted in
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00466.html .

I had mistakenly thought this option was already enabled upstream.

This provides the option +f64mm, that turns on the 64 bit floating point
matrix multiply extension.  This extension is only available for
AArch64.  Turning on this extension also turns on the SVE extension.

This extension is optional and only available at Armv8.2-A and onward.

We also add the ACLE defined macro for this extension.

Tested on aarch64 cross compiler from x86.

gcc/ChangeLog:

2020-01-13  Matthew Malcomson  

* config/aarch64/aarch64-c.c (_ARM_FEATURE_MATMUL_FLOAT64):
Introduce this ACLE specified predefined macro.
* config/aarch64/aarch64-option-extensions.def (f64mm): New.
(fp): Disabling this disables f64mm.
(simd): Disabling this disables f64mm.
(fp16): Disabling this disables f64mm.
(sve): Disabling this disables f64mm.
* config/aarch64/aarch64.h (AARCH64_FL_F64MM): New.
(AARCH64_ISA_F64MM): New.
(TARGET_F64MM): New.
* doc/invoke.texi (f64mm): Document new option.

gcc/testsuite/ChangeLog:

2020-01-13  Matthew Malcomson  

* gcc.target/aarch64/pragma_cpp_predefs_2.c: Check for f64mm
predef.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 
9ccca422b045fcd6875f192680220b2af9d59d31..22fc4a5e337acb4cdd96618b57736f55892bd0e7
 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -166,6 +166,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (TARGET_MEMTAG, "__ARM_FEATURE_MEMORY_TAGGING", pfile);
 
   aarch64_def_or_undef (TARGET_I8MM, "__ARM_FEATURE_MATMUL_INT8", pfile);
+  aarch64_def_or_undef (TARGET_F64MM, "__ARM_FEATURE_MATMUL_FP64", pfile);
   aarch64_def_or_undef (TARGET_BF16_SIMD,
"__ARM_FEATURE_BF16_VECTOR_ARITHMETIC", pfile);
   aarch64_def_or_undef (TARGET_BF16_FP,
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
b/gcc/config/aarch64/aarch64-option-extensions.def
index 
5022a1b3552f35364e35b3955bf2c39a33ab0752..6057b33033a0f3a8be7d656dc1e459d4d93b2842
 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -53,26 +53,26 @@
 /* Enabling "fp" just enables "fp".
Disabling "fp" also disables "simd", "crypto", "fp16", "aes", "sha2",
"sha3", sm3/sm4, "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4",
-   "sve2-bitperm", "i8mm" and "bf16".  */
+   "sve2-bitperm", "i8mm", "f64mm", and "bf16".  */
 AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | \
  AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | \
  AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | AARCH64_FL_SM4 | \
  AARCH64_FL_SVE | AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | \
  AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | \
  AARCH64_FL_SVE2_BITPERM | AARCH64_FL_I8MM | \
- AARCH64_FL_BF16, false, "fp")
+ AARCH64_FL_F64MM | AARCH64_FL_BF16, false, "fp")
 
 /* Enabling "simd" also enables "fp".
Disabling "simd" also disables "crypto", "dotprod", "aes", "sha2", "sha3",
"sm3/sm4", "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4",
-   "sve2-bitperm", and "i8mm".  */
+   "sve2-bitperm", "i8mm", and "f64mm".  */
 AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, \
  AARCH64_FL_CRYPTO | AARCH64_FL_DOTPROD | \
  AARCH64_FL_AES | AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | \
  AARCH64_FL_SM4 | AARCH64_FL_SVE | AARCH64_FL_SVE2 | \
  AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | \
  AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM | \
- AARCH64_FL_I8MM, false, \
+ AARCH64_FL_I8MM | AARCH64_FL_F64MM, false, \
  "asimd")
 
 /* Enabling "crypto" also enables "fp", "simd", "aes" and "sha2".
@@ -92,12 +92,13 @@ AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, 0, 0, false, 
"crc32")
 AARCH64_OPT_EXTENSION("lse", AARCH64_FL_LSE, 0, 0, false, "atomics")
 
 /* Enabling "fp16" also enables &q

Re: [PATCH 6/X] [libsanitizer] Add hwasan pass and associated gimple changes

2020-01-13 Thread Matthew Malcomson
> On 12/12/19 4:19 PM, Matthew Malcomson wrote:
>> - if (is_store && !param_asan_instrument_writes)
>> + if (is_store
>> + && (!param_asan_instrument_writes || !param_hwasan_instrument_writes))
>> return;
>> - if (!is_store && !param_asan_instrument_reads)
>> + if (!is_store
>> + && (!param_asan_instrument_reads || !param_hwasan_instrument_reads))
>> return;
> 
> I know it's very unlikely, but one can use -fsanitize=address and
> --param hwasan-instrument-reads=0 which will drop instrumentation of reads
> for ASAN.

Ah! Thanks for the catch.
Updated patch is attached and has been tested.

I've also attached the patch including the new `Optimization` keyword to the
hwasan parameters to this email -- (putting both on this email to avoid a bit of
email spam).

> 
> Similarly for other parameters.
> 
> Martin


Inlining the new bit that avoids the problem you pointed out above, since
the implementation of that is the only new part someone might object to.


###

diff --git a/gcc/asan.c b/gcc/asan.c
index 
fe6841b4f084f75be534cc9653079ca0a5bdc94e..55723bf4d5d2a4111eb574d169f21332f6eb33ff
 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -326,6 +326,25 @@ asan_sanitize_allocas_p (void)
   return (asan_sanitize_stack_p () && param_asan_protect_allocas);
 }
 
+bool
+asan_instrument_reads (void)
+{
+  return (sanitize_flags_p (SANITIZE_ADDRESS) && param_asan_instrument_reads);
+}
+
+bool
+asan_instrument_writes (void)
+{
+  return (sanitize_flags_p (SANITIZE_ADDRESS) && param_asan_instrument_writes);
+}
+
+bool
+asan_memintrin (void)
+{
+  return (sanitize_flags_p (SANITIZE_ADDRESS) && param_asan_memintrin);
+}
+
+
 /* Checks whether section SEC should be sanitized.  */
 
 static bool
@@ -1382,6 +1673,28 @@ hwasan_sanitize_allocas_p (void)
   return (hwasan_sanitize_stack_p () && param_hwasan_protect_allocas);
 }
 
+/* Should we instrument reads?  */
+bool
+hwasan_instrument_reads (void)
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_reads);
+}
+
+/* Should we instrument writes?  */
+bool
+hwasan_instrument_writes (void)
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_writes);
+}
+
+/* Should we instrument builtin calls?  */
+bool
+hwasan_memintrin (void)
+{
+  return (hwasan_sanitize_p () && param_hwasan_memintrin);
+}
+
+
 /* Insert code to protect stack vars.  The prologue sequence should be emitted
directly, epilogue sequence returned.  BASE is the register holding the
stack base, against which OFFSETS array offsets are relative to, OFFSETS
@@ -2220,9 +2539,9 @@ static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
   location_t location, bool is_store)
 {
-  if (is_store && !param_asan_instrument_writes)
+  if (is_store && !(asan_instrument_writes () || hwasan_instrument_writes ()))
 return;
-  if (!is_store && !param_asan_instrument_reads)
+  if (!is_store && !(asan_instrument_reads () || hwasan_instrument_reads ()))
 return;
 
   tree type, base;
@@ -2376,7 +2696,7 @@ instrument_mem_region_access (tree base, tree len,
 static bool
 instrument_builtin_call (gimple_stmt_iterator *iter)
 {
-  if (!param_asan_memintrin)
+  if (!(asan_memintrin () || hwasan_memintrin ()))
 return false;
 
   bool iter_advanced_p = false;

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 
77649e839c968c38485ea30d332576b9b089effe..c5746e3c89283ef0652de85cf6b94e76263aba38
 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
bool *);
 static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
  int, bool *);
+static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
+   int, bool *);
 static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
 int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
  handle_no_sanitize_address_attribute, NULL },
+  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
+ handle_no_sanitize_hwaddress_attribute, NULL },
   { "no_sanitize_thread", 0, 0, true, false, false, false,
  handle_no_sanitize_thread_attribute, NULL 

Re: Document --with-build-config=bootstrap-asan option.

2020-01-13 Thread Matthew Malcomson
On 11/01/2020 07:19, Gerald Pfeifer wrote:
> On Thu, 12 Dec 2019, Matthew Malcomson wrote:
>> gcc/ChangeLog:
>>
>> 2019-12-12  Matthew Malcomson  
>>
>>  * doc/install.texi: Document bootstrap-asan configuration option.
> 
> I see this introduces a new table.
> 
>> +Some examples of build configurations designed for developers of GCC are:
> 
> @samp{bootstrap-time}, @samp{bootstrap-debug-ckovw} and others appear
> to fall into the same camp, essentially expected to be used by maintainers
> only.
> 
> Would it make sense to add your new option to the existing table, or
> perhaps see which other options from the existing table to move into
> your new one?  Thoughts?

Sounds good me.


> 
> 
> The patch is okay modulo the question above.
> 
> Thanks,
> Gerald
> 

Patch with above suggestion.

#

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
80b47812fe66a8ef50edf3aad9708ab3409ba7dc..0705759c69f64c6d06e91f7ae83bb8c1ad210f34
 
100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2668,6 +2668,10 @@ Arranges for the run time of each program started 
by the GCC driver,
  built in any stage, to be logged to @file{time.log}, in the top level of
  the build tree.

+@item @samp{bootstrap-asan}
+Compiles GCC itself using Address Sanitization in order to catch 
invalid memory
+accesses within the GCC code.
+
  @end table

  @section Building a cross compiler



[Patch] [AArch64] [SVE] Implement svld1ro intrinsic.

2020-01-09 Thread Matthew Malcomson
We take no action to ensure the SVE vector size is large enough.  It is
left to the user to check that before compiling this intrinsic or before
running such a program on a machine.

The main difference between ld1ro and ld1rq is in the allowed offsets,
the implementation difference is that ld1ro is implemented using integer
modes since there are no pre-existing vector modes of the relevant size.
Adding new vector modes simply for this intrinsic seems to make the code
less tidy.

Specifications can be found under the "Arm C Language Extensions for
Scalable Vector Extension" title at
https://developer.arm.com/architectures/system-architectures/software-standards/acle

gcc/ChangeLog:

2020-01-09  Matthew Malcomson  

* config/aarch64/aarch64-protos.h
(aarch64_sve_ld1ro_operand_p): New.
* config/aarch64/aarch64-sve-builtins-base.cc
(class load_replicate): New.
(class svld1ro_impl): New.
(class svld1rq_impl): Change to inherit from load_replicate.
(svld1ro): New sve intrinsic function base.
* config/aarch64/aarch64-sve-builtins-base.def (svld1ro):
New DEF_SVE_FUNCTION.
* config/aarch64/aarch64-sve-builtins-base.h
(svld1ro): New decl.
* config/aarch64/aarch64-sve-builtins.cc
(function_expander::add_mem_operand): Modify assert to allow
OImode.
* config/aarch64/aarch64-sve.md (@aarch64_sve_ld1ro): New
pattern.
* config/aarch64/aarch64.c
(aarch64_sve_ld1rq_operand_p): Implement in terms of ...
(aarch64_sve_ld1rq_ld1ro_operand_p): This.
(aarch64_sve_ld1ro_operand_p): New.
* config/aarch64/aarch64.md (UNSPEC_LD1RO): New unspec.
* config/aarch64/constraints.md (UOb,UOh,UOw,UOd): New.
* config/aarch64/predicates.md
(aarch64_sve_ld1ro_operand_{b,h,w,d}): New.

gcc/testsuite/ChangeLog:

2020-01-09  Matthew Malcomson  

* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
c16b9362ea986ff221755bfc4d10bae674a67ed4..6d2162b93932e433677dae48e5c58975be2902d2
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -582,6 +582,7 @@ rtx aarch64_simd_gen_const_vector_dup (machine_mode, 
HOST_WIDE_INT);
 bool aarch64_simd_mem_operand_p (rtx);
 bool aarch64_sve_ld1r_operand_p (rtx);
 bool aarch64_sve_ld1rq_operand_p (rtx);
+bool aarch64_sve_ld1ro_operand_p (rtx, scalar_mode);
 bool aarch64_sve_ldff1_operand_p (rtx);
 bool aarch64_sve_ldnf1_operand_p (rtx);
 bool aarch64_sve_ldr_operand_p (rtx);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 
38bd3adce1ebbde4c58531ffd26eedd4ae4938b0..e52a6012565fadd84cdd77a613f887e5ae53a576
 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1139,7 +1139,7 @@ public:
   }
 };
 
-class svld1rq_impl : public function_base
+class load_replicate : public function_base
 {
 public:
   unsigned int
@@ -1153,7 +1153,11 @@ public:
   {
 return fi.scalar_type (0);
   }
+};
 
+class svld1rq_impl : public load_replicate
+{
+public:
   machine_mode
   memory_vector_mode (const function_instance ) const OVERRIDE
   {
@@ -1168,6 +1172,23 @@ public:
   }
 };
 
+class svld1ro_impl : public load_replicate
+{
+public:
+  machine_mode
+  memory_vector_mode (const function_instance ) const OVERRIDE
+  {
+return OImode;
+  }
+
+  rtx
+  expand (function_expander ) const OVERRIDE
+  {
+insn_code icode = code_for_aarch64_sve_ld1ro (e.vector_mode (0));
+return e.use_contiguous_load_insn (icode);
+  }
+};
+
 /* Implements svld2, svld3 and svld4.  */
 class svld234_impl : public full_width_access
 {
@@ -2571,6 +2592,7 @@ FUNCTION (svlasta, svlast_impl, (UNSPEC_LASTA))
 FUNCTION (svlastb, svlast_impl, (UNSPEC_LASTB))
 FUNCTION (svld1, svld1_impl,)
 FUNCTION (svld1_gather, svld1_gather_impl,)
+FUNCTION (svld1ro, svld1ro_impl,)
 FUNCTION (svld1rq, svld1rq_impl,)
 FUNCTION (svld1sb, svld1_extend_impl, (TYPE_SUFFIX_s8))
 FUNCTION (svld1sb_gather, svld1_gather_extend_impl, (TYPE_SUFFIX_s8))
diff --git a/gcc/conf

Re: [Patch 0/X] HWASAN v3

2020-01-08 Thread Matthew Malcomson
Hi everyone,

I'm writing this email to summarise & publicise the state of this patch 
series, especially the difficulties around approval for GCC 10 mentioned 
on IRC.


The main obstacle seems to be that no maintainer feels they have enough 
knowledge about hwasan and justification that it's worthwhile to approve 
the patch series.

Similarly, Martin has given a review of the parts of the code he can 
(thanks!), but doesn't feel he can do a deep review of the code related 
to the RTL hooks and stack expansion -- hence that part is as yet not 
reviewed in-depth.



The questions around justification raised on IRC are mainly that it 
seems like a proof-of-concept for MTE rather than a stand-alone useable 
sanitizer.  Especially since in the GNU world hwasan instrumented code 
is not really ready for production since we can only use the 
less-"interceptor ABI" rather than the "platform ABI".  This restriction 
is because there is no version of glibc with the required modifications 
to provide the "platform ABI".

(n.b. that since https://reviews.llvm.org/D69574 the code-generation for 
these ABI's is the same).


 From my perspective the reasons that make HWASAN useful in itself are:

1) Much less memory usage.

 From a back-of-the-envelope calculation based on the hwasan paper's 
table of memory overhead from over-alignment 
https://arxiv.org/pdf/1802.09517.pdf  I guess hwasan instrumented code 
has an overhead of about 1.1x (~4% from overalignment and ~6.25% from 
shadow memory), while asan seems to have an overhead somewhere in the 
range 1.5x - 3x.

Maybe there's some data out there comparing total overheads that I 
haven't found? (I'd appreciate a reference if anyone has that info).



2) Available on more architectures that MTE.

HWASAN only requires TBI, which is a feature of all AArch64 machines, 
while MTE will be an optional extension and only available on certain 
architectures.


3) This enables using hwasan in the kernel.

While instrumented user-space applications will be using the 
"interceptor ABI" and hence are likely not production-quality, the 
biggest aim of implementing hwasan in GCC is to allow building the Linux 
kernel with tag-based sanitization using GCC.

Instrumented kernel code uses hooks in the kernel itself, so this ABI 
distinction is no longer relevant, and this sanitizer should produce a 
production-quality kernel binary.




I'm hoping I can find a maintainer willing to review and ACK this patch 
series -- especially with stage3 coming to a close soon.  If there's 
anything else I could do to help get someone willing up-to-speed then 
please just ask.


Cheers,
Matthew



On 07/01/2020 15:14, Martin Liška wrote:
> On 12/12/19 4:18 PM, Matthew Malcomson wrote:
> 
> Hello.
> 
> I've just sent few comments that are related to the v3 of the patch set.
> Based on the HWASAN (limited) knowledge the patch seems reasonable to me.
> I haven't looked much at the newly introduced RTL-hooks.
> But these seems to me isolated to the aarch64 port.
> 
> I can also verify that the patchset works on my aarch64 linux machine and
> hwasan.exp and asan.exp tests succeed.
> 
>> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory 
>> tagging,
>> since I'm not sure the way I found to implement this would be 
>> acceptable.  The
>> inlined patch below works but it requires a special declaration 
>> instead of just
>> an ~#include~.
> 
> Knowing that, I would not bother with the printing of HWASAN_MARK.
> 
> Thanks for the series,
> Martin



[PING] Re: [Patch 0/X] HWASAN v3

2020-01-06 Thread Matthew Malcomson
Ping


On 17/12/2019 14:11, Matthew Malcomson wrote:
> I've noticed a few minor problems with this patch series after I sent it
> out (mostly testcase stuff, one documentation tidy-up, but also that one
> patch didn't bootstrap due to something fixed in a later patch).
> 
> I also rely on a documentation change that isn't part of the series.
> 
> I figure I should make this easy on anyone that wants to try the patch
> series out, so I'm attaching a compressed tarfile containing the entire
> patch series plus the additional documentation patch so it can all be
> applied at once with `git apply *`.
> 
> It's attached.
> 
> Matthew.
> 
> 
> 
> On 12/12/2019 15:18, Matthew Malcomson wrote:
>> Hello,
>>
>> I've gone through the suggestions Martin made and implemented  the ones I 
>> think
>> I can implement for GCC10.
>>
>> The two functionality changes in this version are:
>> Added the --param's hwasan-instrument-reads, hwasan-instrument-writes,
>> hwasan-instrument-allocas, hwasan-memintrin, options.  I.e. Those that asan 
>> has
>> and that make sense for hwasan.
>>
>> Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
>> deterministic tagging approach.
>>
>>
>> There are a lot of extra comments and tests.
>>
>>
>> Bootstrapped and regtested on x86_64 and AArch64.
>> Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and 
>> hwasan
>> features tested there.
>> Built the linux kernel using this feature and ran the test_kasan.ko testing 
>> to
>> check the this works for the kernel.
>> (NOTE: I actually did all the above testing before a search and replace of
>> `memory_tagging_p` for `hwasan_sanitize_p` and fixing a typo in the
>> `hwasan-instrument-allocas` parameter name, I will run all the tests again
>> before committing but figure I'll send this out now since I fully expect the
>> tests to still pass).
>>
>>
>> I noticed one extra testsuite failure from those mentioned in the previous
>> version emails: g++.dg/cpp2a/ucn2.C.
>> I believe this is HWASAN correctly catching a problem in the compiler.
>> I've logged the issue here 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 .
>>
>>
>> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
>> since I'm not sure the way I found to implement this would be acceptable.  
>> The
>> inlined patch below works but it requires a special declaration instead of 
>> just
>> an ~#include~.
>>
>>
>> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
>> index a1bc081..d81eb12 100644
>> --- a/gcc/internal-fn.h
>> +++ b/gcc/internal-fn.h
>> @@ -101,10 +101,16 @@ extern void init_internal_fns ();
>>
>>extern const char *const internal_fn_name_array[];
>>
>> +
>> +extern bool hwasan_sanitize_p (void);
>>static inline const char *
>>internal_fn_name (enum internal_fn fn)
>>{
>> -  return internal_fn_name_array[(int) fn];
>> +  const char *ret = internal_fn_name_array[(int) fn];
>> +  if (! strcmp (ret, "ASAN_MARK")
>> +  && hwasan_sanitize_p ())
>> +return "HWASAN_MARK";
>> +  return ret;
>>}
>>
>>extern internal_fn lookup_internal_fn (const char *);
>>
>>
>> Entire patch series attached to cover letter.
>>



Re: [Patch 0/X] HWASAN v3

2019-12-17 Thread Matthew Malcomson
I've noticed a few minor problems with this patch series after I sent it 
out (mostly testcase stuff, one documentation tidy-up, but also that one 
patch didn't bootstrap due to something fixed in a later patch).

I also rely on a documentation change that isn't part of the series.

I figure I should make this easy on anyone that wants to try the patch 
series out, so I'm attaching a compressed tarfile containing the entire 
patch series plus the additional documentation patch so it can all be 
applied at once with `git apply *`.

It's attached.

Matthew.



On 12/12/2019 15:18, Matthew Malcomson wrote:
> Hello,
> 
> I've gone through the suggestions Martin made and implemented  the ones I 
> think
> I can implement for GCC10.
> 
> The two functionality changes in this version are:
> Added the --param's hwasan-instrument-reads, hwasan-instrument-writes,
> hwasan-instrument-allocas, hwasan-memintrin, options.  I.e. Those that asan 
> has
> and that make sense for hwasan.
> 
> Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
> deterministic tagging approach.
> 
> 
> There are a lot of extra comments and tests.
> 
> 
> Bootstrapped and regtested on x86_64 and AArch64.
> Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and hwasan
> features tested there.
> Built the linux kernel using this feature and ran the test_kasan.ko testing to
> check the this works for the kernel.
> (NOTE: I actually did all the above testing before a search and replace of
> `memory_tagging_p` for `hwasan_sanitize_p` and fixing a typo in the
> `hwasan-instrument-allocas` parameter name, I will run all the tests again
> before committing but figure I'll send this out now since I fully expect the
> tests to still pass).
> 
> 
> I noticed one extra testsuite failure from those mentioned in the previous
> version emails: g++.dg/cpp2a/ucn2.C.
> I believe this is HWASAN correctly catching a problem in the compiler.
> I've logged the issue here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 
> .
> 
> 
> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
> since I'm not sure the way I found to implement this would be acceptable.  The
> inlined patch below works but it requires a special declaration instead of 
> just
> an ~#include~.
> 
> 
> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> index a1bc081..d81eb12 100644
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -101,10 +101,16 @@ extern void init_internal_fns ();
>   
>   extern const char *const internal_fn_name_array[];
>   
> +
> +extern bool hwasan_sanitize_p (void);
>   static inline const char *
>   internal_fn_name (enum internal_fn fn)
>   {
> -  return internal_fn_name_array[(int) fn];
> +  const char *ret = internal_fn_name_array[(int) fn];
> +  if (! strcmp (ret, "ASAN_MARK")
> +  && hwasan_sanitize_p ())
> +return "HWASAN_MARK";
> +  return ret;
>   }
>   
>   extern internal_fn lookup_internal_fn (const char *);
> 
> 
> Entire patch series attached to cover letter.
> 



all-patches.tar.gz
Description: all-patches.tar.gz


Re: [PATCH 7/X] [libsanitizer] Add tests

2019-12-16 Thread Matthew Malcomson
I just remembered that the run tests all fail on cross builds.
This patch includes the effective target check to determine whether a hwasan
binary can be run on a given target.

We then add that target requirement to all tests which need to run.



Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

2019-12-16  Matthew Malcomson  

* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/param-memintrin.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New file.
* g++.dg/hwasan/rvo-handled.C: New test.
* gcc.dg/hwasan/hwasan.exp: New file.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New file.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c 
b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
new file mode 100644
index 
..d38b1f3f62d97dc3f5c3882137c370700fd4f9a0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-require-effective-target hwaddress_exec } */
+/* { dg-shouldfail "hwasan" } */
+/* This program fails at runtime in the libhwasan library.
+   The allocator can't handle the requested invalid alignment.  */
+
+int
+main ()
+{
+  void *p = __builtin_aligned_alloc (17, 100);
+  if (((unsigned long long)p & 0x10) == 0)
+return 0;
+  return 1;
+}
+
+/* { dg-output "HWAddressSanitizer: invalid alignment requested in 
aligned_alloc: 17" } */
diff --git a/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c 
b/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c
new file mode 100644
index 
..5e4c168f77efac5c9335768fb5b76fb3752783e9
--- /dev/null
+++ b/gcc/testsuite/c-c++-co

[PATCH 3/X] [libsanitizer] Add option to bootstrap using HWASAN

2019-12-12 Thread Matthew Malcomson
Updated to include documentation ChangeLog and apply cleanly on the
bootstrap-asan documentation I mentioned in a different email.



This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

2019-08-29  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

2019-12-12  Matthew Malcomson  

* bootstrap-hwasan.mk: New file.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* doc/install.texi: Document new option.

libiberty/ChangeLog:

2019-12-12  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
f6027397cedd5c28cd821a0bc7629d6537393ecf..0b502f2784927d72eab248741633305296bf09e7
 100755
--- a/configure
+++ b/configure
@@ -7270,7 +7270,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
50e2fa135b10486170f68ffa162dcbeee2adff90..f793e14e0316e26d05ba72696e8c7c10641c7c96
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2775,7 +2775,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
c5937c94b7cebd3d0f1d7d551a7b593c128e7673..37cb8ed15671205567ee20286e21fb904b3e3c9f
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2676,6 +2676,12 @@ Some examples of build configurations designed for 
developers of GCC are:
 @item @samp{bootstrap-asan}
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
+
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+targets with a very recent linux kernel (5.4 or later).
+
 @end table
 
 @section Building a cross compiler
diff --git a/libiberty/configure b/libiberty/configure
index 
7a34dabec32b0b383bd33f07811757335f4dd39c..cb2dd4ff5295598343cc18b3a79a86a778f2261d
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5261,6 +5261,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
f1ce76010c9acde79c5dc46686a78b2e2f19244e..043237628b79cbf37d07359b59c5ffe17a7a22ef
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
28dc21014b2e86988fa88adabd63ce6092e18e02..34aa397d785e3cc9b6975de460d065900364c3ff
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@
 AM_LDFLAGS = @ac_lto_plugin_

Document --with-build-config=bootstrap-asan option.

2019-12-12 Thread Matthew Malcomson
Document how to configure using asan (bootstrap-asan option).

Since I'm adding a bootstrap-hwasan option and documenting that, 
bootstrap-asan should also be documented.

(As Martin pointed out in the hwasan reviews).



(FYI I now notice that my hwasan patch 3/X needs this to apply cleanly).



gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* doc/install.texi: Document bootstrap-asan configuration option.





###


commit 6e0bbe33120ad3f92e4266ecfe4ecb8ce8958865
Author: Matthew Malcomson 
Date:   Wed Nov 6 12:48:08 2019 +

 Document bootstrap-asan compilation option

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 80b4781..c5937c9 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2670,6 +2670,14 @@ the build tree.

  @end table

+Some examples of build configurations designed for developers of GCC are:
+
+@table @asis
+@item @samp{bootstrap-asan}
+Compiles GCC itself using Address Sanitization in order to catch 
invalid memory
+accesses within the GCC code.
+@end table
+
  @section Building a cross compiler

  When building a cross compiler, it is not generally possible to do a



[PATCH 7/X] [libsanitizer] Add tests

2019-12-12 Thread Matthew Malcomson
Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

2019-12-12  Matthew Malcomson  

* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/param-memintrin.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New file.
* g++.dg/hwasan/rvo-handled.C: New test.
* g++.dg/hwasan/try-catch-0.cpp: New test.
* g++.dg/hwasan/try-catch-1.cpp: New test.
* gcc.dg/hwasan/hwasan.exp: New file.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New file.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c 
b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
new file mode 100644
index 
..e5837a9f0766fc4d91fabc1a3c2e0017f845dc46
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-shouldfail "hwasan" } */
+/* This program fails at runtime in the libhwasan library.
+   The allocator can't handle the requested invalid alignment.  */
+
+int
+main ()
+{
+  void *p = __builtin_aligned_alloc (17, 100);
+  if (((unsigned long long)p & 0x10) == 0)
+return 0;
+  return 1;
+}
+
+/* { dg-output "HWAddressSanitizer: invalid alignment requested in 
aligned_alloc: 17" } */
diff --git a/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c 
b/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c
new file mode 100644
index 
..c6b4d264b8c477b52b859a0411aabb0fc8dde352
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+#define alloca __builtin_alloca
+
+int __attribute__ ((noinline))
+using_alloca (int num)
+{
+  int retval = 0;
+  int *big_a

[PATCH 6/X] [libsanitizer] Add hwasan pass and associated gimple changes

2019-12-12 Thread Matthew Malcomson
There are four main features to this change:

1) Check pointer tags match address tags.

In the new `hwasan` pass we put HWASAN_CHECK internal functions around
all memory accesses, to check that tags in the pointer being used match
the tag stored in shadow memory for the memory region being used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* asan.c (handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(hwasan_instrument): New.
(hwasan_base): New.
(hwasan_emit_untag_frame): Free block-scope-var hash map.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
(class pass_hwasan): New.
(make_pass_hwasan): New.
(class pass_hwasan_O0): New.
(make_pass_hwasan_O0): New.
* asan.h (hwasan_base): New decl.
(hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(enum hwasan_mark_flags): New.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
* internal-fn.def (HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_ALLOCA_UNPOISON): New.
* passes.def: Add hwasan and hwasan_O0 passes.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.
(BUILT_IN_HWASAN_STORE8): New.
(BUILT_IN_HWASAN_STORE16): New.
(BUILT_IN_HWASAN_STOREN): New.
(BUILT_IN_HWASAN_LOAD1_NOABORT): New.
(BUILT_IN_HWASAN_LOAD2_NOABORT): New.
(BUILT_IN_HWASAN_LOAD4_NOABORT): New.
(BUILT_IN_HWASAN_LOA

[PATCH 5/X] [libsanitizer][mid-end] Introduce stack variable handling for HWASAN

2019-12-12 Thread Matthew Malcomson
Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
variable as an alignment boundary between the end and the start of any
other data stored on the stack.

This patch ensures that by adding alignment requirements in
`align_local_variable` and forcing all stack variable allocation to be
deferred so that `expand_stack_vars` can ensure the stack pointer is
aligned before allocating any variable for the current frame.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

The tag offsets are also handled by a backend hook.

This patch also adds some macros defining how the HWASAN shadow memory
is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable the tag to match the tag added to each pointer for
that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tag.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* asan.c (hwasan_record_base): New function.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New function.
(hwasan_with_tag): New function.
(hwasan_tag_init): New function.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_tag): New.
(hwasan_emit_prologue): New.
(hwasan_create_untagged_base): New.
(hwasan_finish_file): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_p): New.
* asan.h (hwasan_record_base): New declaration.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New declaration.
(hwasan_with_tag): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_tag_init): New declaration.
(hwasan_sanitize_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE):New macro.
(HWASAN_SHIFT):New macro.
(HWASAN_SHIFT_RTX):New macro.
(HWASAN_STACK_BACKGROUND):New macro.
(hwasan_finish_file): New.
(hwasan_current_tag): New.
(hwasan_create_untagged_base): New.
(hwasan_emit_prologue): New.
* cfgexpand.c (struct stack_vars_data): Add information to
record hwasan variable stack offsets.
(expand_stack_vars): Ensure variables are offset from a tagged
base. Record offsets for hwasan. Ensure alignment.
(expand_used_vars): Call function to emit prologue, and get
untagging instructions for function exit.
(align_local_variable): Ensure alignment.
(defer_stack_allocation): Ensure all variables are deferred so
they can be handled by `expand_stack_vars`.
(expand_one_stack_var_at): Account for tags in
variables when using HWASAN.
(expand_one_stack_var_1): Pass new argument to
expand_one_stack_var_at.
(init_vars_expansion): Initialise hwasan internal variables when
starting variable expansion.
* doc/tm.texi (TARGET_MEMTAG_GENTAG): Document.
* doc/tm.texi.in (TARGET_MEMTAG_GENTAG): Document

[PATCH 1/X] [libsanitizer] Tie the hwasan library into our build system

2019-12-12 Thread Matthew Malcomson
This patch tries to tie libhwasan into the GCC build system in the same way
that the other sanitizer runtime libraries are handled.

libsanitizer/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am:  Build libhwasan.
* Makefile.in:  Build libhwasan.
* asan/Makefile.in:  Build libhwasan.
* configure:  Build libhwasan.
* configure.ac:  Build libhwasan.
* hwasan/Makefile.am: New file.
* hwasan/Makefile.in: New file.
* hwasan/libtool-version: New file.
* interception/Makefile.in: Build libhwasan.
* libbacktrace/Makefile.in: Build libhwasan.
* libsanitizer.spec.in: Build libhwasan.
* lsan/Makefile.in: Build libhwasan.
* merge.sh: Build libhwasan.
* sanitizer_common/Makefile.in: Build libhwasan.
* tsan/Makefile.in: Build libhwasan.
* ubsan/Makefile.in: Build libhwasan.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,11 +14,12 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan
+SUBDIRS += lsan asan ubsan hwasan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
-  include/sanitizer/tsan_interface.h
+  include/sanitizer/tsan_interface.h \
+  include/sanitizer/hwasan_interface.h
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
0d789b3a59d21ea2e5a23057ca3afe15425feec4..36aa952af7e04bc0e4fb94cdcd584d539193d781
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -92,7 +92,8 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@am__append_1 = 
include/sanitizer/common_interface_defs.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/lsan_interface.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/asan_interface.h \
-@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h \
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/hwasan_interface.h
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
@@ -206,7 +207,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan tsan
+   ubsan hwasan tsan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -328,6 +329,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
@@ -361,7 +363,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 
00b6082da5372efd679ddc230f588bbc58161ef6..76689c3b224b1fb04895ae48829eac4b6784cd84
 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -382,6 +382,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
79b5c1eadb59018bca13a33f19f3494c170365ee..ff72af73e6f77aaf93bf39e6799f896851a377dd
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -657,6 +657,7 @@ USING_MAC_INTERPOSE_TRUE
 link_liblsan
 link_libubsan
 link_libtsan
+link_libhwasan
 link_libasan
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
@@ -12334,7 +12335,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12337 "configure"
+#line 12338 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12440,7 +12441,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12443 "configure"
+#line 12444 "configure"
 #

[Patch 0/X] HWASAN v3

2019-12-12 Thread Matthew Malcomson
Hello,

I've gone through the suggestions Martin made and implemented  the ones I think
I can implement for GCC10.

The two functionality changes in this version are:
Added the --param's hwasan-instrument-reads, hwasan-instrument-writes,
hwasan-instrument-allocas, hwasan-memintrin, options.  I.e. Those that asan has
and that make sense for hwasan.

Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
deterministic tagging approach.


There are a lot of extra comments and tests.


Bootstrapped and regtested on x86_64 and AArch64.
Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and hwasan
features tested there.
Built the linux kernel using this feature and ran the test_kasan.ko testing to
check the this works for the kernel.
(NOTE: I actually did all the above testing before a search and replace of
`memory_tagging_p` for `hwasan_sanitize_p` and fixing a typo in the
`hwasan-instrument-allocas` parameter name, I will run all the tests again
before committing but figure I'll send this out now since I fully expect the
tests to still pass).


I noticed one extra testsuite failure from those mentioned in the previous
version emails: g++.dg/cpp2a/ucn2.C.
I believe this is HWASAN correctly catching a problem in the compiler.
I've logged the issue here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 .


I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
since I'm not sure the way I found to implement this would be acceptable.  The
inlined patch below works but it requires a special declaration instead of just
an ~#include~.


diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index a1bc081..d81eb12 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -101,10 +101,16 @@ extern void init_internal_fns ();
 
 extern const char *const internal_fn_name_array[];
 
+
+extern bool hwasan_sanitize_p (void);
 static inline const char *
 internal_fn_name (enum internal_fn fn)
 {
-  return internal_fn_name_array[(int) fn];
+  const char *ret = internal_fn_name_array[(int) fn];
+  if (! strcmp (ret, "ASAN_MARK")
+  && hwasan_sanitize_p ())
+return "HWASAN_MARK";
+  return ret;
 }
 
 extern internal_fn lookup_internal_fn (const char *);


Entire patch series attached to cover letter.

all-patches.tar.gz
Description: all-patches.tar.gz


[PATCH 3/X] [libsanitizer] Add option to bootstrap using HWASAN

2019-12-12 Thread Matthew Malcomson
This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

2019-08-29  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

2019-12-12  Matthew Malcomson  

* bootstrap-hwasan.mk: New file.

libiberty/ChangeLog:

2019-12-12  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
aec9186b2b0123d3088b69eb1ee541567654953e..6f71b111bd18ec053180beecf83dd4549e83c2b9
 100755
--- a/configure
+++ b/configure
@@ -7270,7 +7270,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
b8ce2ad20b9d03e42731252a9ec2a8417c13e566..16bfdf164555dad94c789f17b6a63ba1a2e3e9f4
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2775,7 +2775,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
6c9579bfaff955eb43875b404fb7db1a667bf522..da9a8809c3440827ac22ef6936e080820197f4e7
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2645,6 +2645,13 @@ Some examples of build configurations designed for 
developers of GCC are:
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
 
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+targets with a very recent linux kernel (5.4 or later).
+
+@end table
+
 @section Building a cross compiler
 
 When building a cross compiler, it is not generally possible to do a
diff --git a/libiberty/configure b/libiberty/configure
index 
7a34dabec32b0b383bd33f07811757335f4dd39c..cb2dd4ff5295598343cc18b3a79a86a778f2261d
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5261,6 +5261,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
f1ce76010c9acde79c5dc46686a78b2e2f19244e..043237628b79cbf37d07359b59c5ffe17a7a22ef
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
28dc21014b2e86988fa88adabd63ce6092e18e02..34aa397d785e3cc9b6975de460d065900364c3ff
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@
 AM_LDFLAGS = @ac_lto_plugin_ldflags@
 AM_LIBTOOLFLAGS = --tag=disable-static
-override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS))
-override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS))
+override CFLAGS :

[PATCH 4/X] [libsanitizer][options] Add hwasan flags and argument parsing

2019-12-12 Thread Matthew Malcomson
These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling targeting the
kernel.  This flag has defaults that allow compiling KASAN with tags as
it is currently implemented.
These defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.
Stack tagging in the kernel is a future aim, stack instrumentation has
not yet been enabled for the kernel for clang either
(https://lists.infradead.org/pipermail/linux-arm-kernel/2019-October/687121.html).

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan-stack): New
(hwasan-random-frame-tag): New
(hwasan-instrument-allocas): New
(hwasan-instrument-reads): New
(hwasan-instrument-writes): New
(hwasan-memintrin): New
* target.def (HOOK_PREFIX): Add new hook.
* targhooks.c (default_memtag_can_tag_addresses): New.
* toplev.c (process_options): Ensure hwasan only on TBI
architectures.

gcc/c-family/ChangeLog:

2019-12-12  Matthew Malcomson  

* c-attribs.c (handle_no_sanitize_hwaddress_attribute): New
attribute.



### Attachment also inlined for ease of reply###


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 
dc56e2ec62ffc7f494cc59ddf02452ac0cb406de..81f8dfc6dd3f54067d7cc0def3c0babc03bcd9c4
 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
bool *);
 static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
  int, bool *);
+static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
+   int, bool *);
 static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
 int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
  handle_no_sanitize_address_attribute, NULL },
+  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
+ handle_no_sanitize_hwaddress_attribute, NULL },
   { "no_sanitize_thread", 0, 0, true, false, false, false,
  handle_no_sanitize_thread_attribute, NULL },
   { "no_sanitize_undefined",  0, 0, true, false, false, false,
@@ -946,6 +950,22 @@ handle_no_sanitize_address_attribute (tree *node, tree 
name, tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "no_sanitize_hwaddress" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_hwaddress_attribute (tree *node, tree name, tree, int,
+ bool *no_add_attrs)
+{
+  *no_add_attrs = true;
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+warning (OPT_Wattributes, "%qE attribute ignored", name);
+  else
+add_no_sanitize_value (*node, SANITIZE_HWADDRESS);
+
+  return NULL_TREE;
+}
+
 /* Handle a "no_sanitize_thread" attribute; arguments as in
struct attr

[PATCH 2/X] [libsanitizer] Only build libhwasan when targeting AArch64

2019-12-12 Thread Matthew Malcomson
Though the library has limited support for x86, we don't have any
support for generating code targeting x86 so there is no point building
for that target.

libsanitizer/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am: Condition building hwasan directory.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Set HWASAN_SUPPORTED based on target
architecture.
* configure.tgt: Likewise.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
2a7e8e1debe838719db0f0fad218b2543cc3111b..065a65e78d49f7689a01ecb64db1f07ca83aa987
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,7 +14,7 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan hwasan
+SUBDIRS += lsan asan ubsan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
@@ -23,6 +23,9 @@ nodist_saninclude_HEADERS += \
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
+if HWASAN_SUPPORTED
+SUBDIRS += hwasan
+endif
 endif
 
 ## May be used by toolexeclibdir.
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
36aa952af7e04bc0e4fb94cdcd584d539193d781..75a99491cb1d4422fd5e2d93cae93eb883ae0963
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -97,6 +97,7 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
+@HWASAN_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_5 = hwasan
 subdir = .
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
@@ -207,7 +208,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan hwasan tsan
+   ubsan tsan hwasan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -363,7 +364,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ $(am__append_4) $(am__append_5)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
ff72af73e6f77aaf93bf39e6799f896851a377dd..4e95194fe3567b1227c4036c2f5bf6540f735975
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -659,6 +659,8 @@ link_libubsan
 link_libtsan
 link_libhwasan
 link_libasan
+HWASAN_SUPPORTED_FALSE
+HWASAN_SUPPORTED_TRUE
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
 TSAN_SUPPORTED_FALSE
@@ -12335,7 +12337,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12338 "configure"
+#line 12340 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12441,7 +12443,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12444 "configure"
+#line 12446 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15792,6 +15794,7 @@ fi
 # Get target configury.
 unset TSAN_SUPPORTED
 unset LSAN_SUPPORTED
+unset HWASAN_SUPPORTED
 . ${srcdir}/configure.tgt
  if test "x$TSAN_SUPPORTED" = "xyes"; then
   TSAN_SUPPORTED_TRUE=
@@ -15809,6 +15812,14 @@ else
   LSAN_SUPPORTED_FALSE=
 fi
 
+ if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  HWASAN_SUPPORTED_TRUE=
+  HWASAN_SUPPORTED_FALSE='#'
+else
+  HWASAN_SUPPORTED_TRUE='#'
+  HWASAN_SUPPORTED_FALSE=
+fi
+
 
 # Check for functions needed.
 for ac_func in clock_getres clock_gettime clock_settime lstat readlink
@@ -16791,7 +16802,7 @@ ac_config_files="$ac_config_files Makefile 
libsanitizer.spec libbacktrace/backtr
 ac_config_headers="$ac_config_headers config.h"
 
 
-ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
hwasan/Makefile ubsan/Makefile"
+ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
ubsan/Makefile"
 
 
 if test "x$TSAN_SUPPORTED" = "xyes"; then
@@ -16799,6 +16810,11 @@ if test "x$TSAN_SUPPORTED" = "xyes"; then
 
 fi
 
+if test "x$HWASAN_SUPPORTED" = "xyes&qu

Re: [PATCH] bring -Warray-bounds closer to -Wstringop-overflow (PR91647, 91463, 91679)

2019-12-09 Thread Matthew Malcomson
On 01/11/2019 21:09, Martin Sebor wrote:
> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
> index 53278168a59..d7c74a1865a 100644
> --- a/gcc/gimple-match-head.c
> +++ b/gcc/gimple-match-head.c
> @@ -837,8 +837,8 @@ try_conditional_simplification (internal_fn ifn, 
> gimple_match_op *res_op,
> gimple_match_op cond_op (gimple_match_cond (res_op->ops[0],
> res_op->ops[num_ops - 1]),
>  op, res_op->type, num_ops - 2);
> -  for (unsigned int i = 1; i < num_ops - 1; ++i)
> -cond_op.ops[i - 1] = res_op->ops[i];
> +
> +  memcpy (cond_op.ops, res_op->ops + 1, (num_ops - 1) * sizeof *cond_op.ops);
> switch (num_ops - 2)
>   {
>   case 2:

I think this copies one extra element than the original code.

(copying `num_ops - 1` elements, while the previous loop only copied 
`num_ops - 2` elements since the counter started at 1).


Re: [mid-end] Add notes to dataflow insn info when re-emitting (PR92410)

2019-12-09 Thread Matthew Malcomson
Ah,  apologies -- you're right.
I'd already committed the patch this morning, so I'll update it with the 
obvious fix.

Thanks for the catch,
Matthew

On 09/12/2019 12:48, Martin Liška wrote:
> Hello.
> 
> The patch triggers the following warning:
> 
> In file included from /home/marxin/Programming/gcc/gcc/regstat.c:23:
> /home/marxin/Programming/gcc/gcc/regstat.c: In function ‘void 
> regstat_bb_compute_calls_crossed(unsigned int, bitmap)’:
> /home/marxin/Programming/gcc/gcc/regstat.c:327:35: warning: comparison 
> of integer expressions of different signedness: ‘int’ and ‘unsigned int’ 
> [-Wsign-compare]
>    327 |   gcc_assert (INSN_UID (insn) < DF_INSN_SIZE ());
> /home/marxin/Programming/gcc/gcc/system.h:748:14: note: in definition of 
> macro ‘gcc_assert’
>    748 |    ((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, 
> __FUNCTION__), 0 : 0))
>    |  ^~~~
> 
> What about something like:
> 
>   gcc/regstat.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/regstat.c b/gcc/regstat.c
> index c6cefb117d7..035d48c28ab 100644
> --- a/gcc/regstat.c
> +++ b/gcc/regstat.c
> @@ -324,7 +324,7 @@ regstat_bb_compute_calls_crossed (unsigned int 
> bb_index, bitmap live)
> 
>     FOR_BB_INSNS_REVERSE (bb, insn)
>   {
> -  gcc_assert (INSN_UID (insn) < DF_INSN_SIZE ());
> +  gcc_assert (INSN_UID (insn) < (int)DF_INSN_SIZE ());
>     struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
>     unsigned int regno;
> 
> Martin



Re: [mid-end] Add notes to dataflow insn info when re-emitting (PR92410)

2019-12-09 Thread Matthew Malcomson
Ah,  apologies -- you're right.
I'd already committed the patch this morning, so I'll update it with the 
obvious fix.

Thanks for the catch,
Matthew

On 09/12/2019 12:48, Martin Liška wrote:
> Hello.
> 
> The patch triggers the following warning:
> 
> In file included from /home/marxin/Programming/gcc/gcc/regstat.c:23:
> /home/marxin/Programming/gcc/gcc/regstat.c: In function ‘void 
> regstat_bb_compute_calls_crossed(unsigned int, bitmap)’:
> /home/marxin/Programming/gcc/gcc/regstat.c:327:35: warning: comparison 
> of integer expressions of different signedness: ‘int’ and ‘unsigned int’ 
> [-Wsign-compare]
>    327 |   gcc_assert (INSN_UID (insn) < DF_INSN_SIZE ());
> /home/marxin/Programming/gcc/gcc/system.h:748:14: note: in definition of 
> macro ‘gcc_assert’
>    748 |    ((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, 
> __FUNCTION__), 0 : 0))
>    |  ^~~~
> 
> What about something like:
> 
>   gcc/regstat.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/regstat.c b/gcc/regstat.c
> index c6cefb117d7..035d48c28ab 100644
> --- a/gcc/regstat.c
> +++ b/gcc/regstat.c
> @@ -324,7 +324,7 @@ regstat_bb_compute_calls_crossed (unsigned int 
> bb_index, bitmap live)
> 
>     FOR_BB_INSNS_REVERSE (bb, insn)
>   {
> -  gcc_assert (INSN_UID (insn) < DF_INSN_SIZE ());
> +  gcc_assert (INSN_UID (insn) < (int)DF_INSN_SIZE ());
>     struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
>     unsigned int regno;
> 
> Martin



Re: [mid-end] Add notes to dataflow insn info when re-emitting (PR92410)

2019-12-02 Thread Matthew Malcomson
Ping

On 12/11/2019 09:11, Matthew Malcomson wrote:
> In scheduling passes, notes are removed with `remove_notes` before the
> scheduling is done, and added back in with `reemit_notes` once the
> scheduling has been decided.
> 
> This process leaves the notes in the RTL chain with different insn uid's
> than were there before.  Having different UID's (larger than the
> previous ones) means that DF_INSN_INFO_GET(insn) will access outside of
> the allocated array.
> 
> This has been seen in the `regstat_bb_compute_calls_crossed` function.
> This patch adds an assert to the `regstat_bb_compute_calls_crossed`
> function so that bad accesses here are caught instead of going
> unnoticed, and then avoids the problem.
> 
> We avoid the problem by ensuring that new notes added by `reemit_notes` have 
> an
> insn record given to them.  This is done by adding a call to
> `df_insn_create_insn_record` on each note added in `reemit_notes`.
> `df_insn_create_insn_record` leaves this new record zeroed out, which appears
> to be fine for notes (e.g. `df_bb_refs_record` already does not set
> anything except the luid for notes, and notes have no dataflow information to
> record).
> 
> We add the testcase that Martin found here
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92410#c2 .
> This testcase fails with the "regstat.c" change, and then succeeds with the
> "haifa-sched.c" change.
> 
> There is a similar problem with labels, that the `gcc_assert` catches
> when running regression tests in gcc.dg/fold-eqandshift-1.c and
> gcc.c-torture/compile/pr32482.c.
> This is due to the `cfg_layout_finalize` call in `bb-reorder.c` emitting
> new labels for the start of the newly created basic blocks. These labels are
> not given a dataflow df_insn_info member.
> 
> We solve this by manually calling `df_recompute_luids` on each basic
> block once this pass has finished.
> 
> Testing done:
>Bootstrapped and regression test on aarch64-none-linux-gnu native.
> 
> gcc/ChangeLog:
> 
> 2019-11-12  Matthew Malcomson  
> 
>   PR middle-end/92410
>   * bb-reorder.c (pass_reorder_blocks::execute): Recompute
>   dataflow luids once basic blocks have been reordered.
>   * haifa-sched.c (reemit_notes): Create df insn record for each
>   new note.
>   * regstat.c (regstat_bb_compute_calls_crossed): Assert every
>   insn has an insn record before trying to use it.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-11-12  Matthew Malcomson  
> 
>   PR middle-end/92410
>   * gcc.dg/torture/pr92410.c: New test.
> 
> 
> 
> ### Attachment also inlined for ease of reply
> ###
> 
> 
> diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
> index 
> 0ac39140c6ce3db8499f99cd8f48321de61b..4943f97a2cfccc7ab7c4834fff2c81f62fc5b738
>  100644
> --- a/gcc/bb-reorder.c
> +++ b/gcc/bb-reorder.c
> @@ -2662,6 +2662,8 @@ pass_reorder_blocks::execute (function *fun)
> bb->aux = bb->next_bb;
> cfg_layout_finalize ();
>   
> +  FOR_EACH_BB_FN (bb, fun)
> +df_recompute_luids (bb);
> return 0;
>   }
>   
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 
> 41cf1f362e8c34d009b0a310ff5b9a9ffb613631..2e1a84f1325d54ece1cf3c6f0bab607099c71d27
>  100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -5433,6 +5433,7 @@ reemit_notes (rtx_insn *insn)
>   
> last = emit_note_before (note_type, last);
> remove_note (insn, note);
> +   df_insn_create_insn_record (last);
>   }
>   }
>   }
> diff --git a/gcc/regstat.c b/gcc/regstat.c
> index 
> 4da9b7cc523cde113df931226ea56310b659e619..c6cefb117d7330cc1b9787c37083e05e2acda2d2
>  100644
> --- a/gcc/regstat.c
> +++ b/gcc/regstat.c
> @@ -324,6 +324,7 @@ regstat_bb_compute_calls_crossed (unsigned int bb_index, 
> bitmap live)
>   
> FOR_BB_INSNS_REVERSE (bb, insn)
>   {
> +  gcc_assert (INSN_UID (insn) < DF_INSN_SIZE ());
> struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
> unsigned int regno;
>   
> diff --git a/gcc/testsuite/gcc.dg/torture/pr92410.c 
> b/gcc/testsuite/gcc.dg/torture/pr92410.c
> new file mode 100644
> index 
> ..628e329782260ef36f26506bfbc0d36a93b33f41
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr92410.c
> @@ -0,0 +1,8 @@
> +/* PR middle-end/92410  */
> +/* { dg-do compile } */
> +int v;
> +
> +int a() {
> +  ;
> +  return v;
> +}
> 



Re: v2 [PATCH 0/X] Introduce HWASAN sanitizer to GCC

2019-11-21 Thread Matthew Malcomson
On 21/11/2019 13:10, Martin Liška wrote:
> On 11/20/19 6:40 PM, Matthew Malcomson wrote:
>> On 20/11/2019 14:29, Martin Liška wrote:
>>> On 11/7/19 7:37 PM, Matthew Malcomson wrote:
>>>> I have rebased this series onto Martin Liska's patches that take the 
>>>> most
>>>> recent libhwasan from upstream LLVM.
>>>> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00340.html
>>> a) For ASAN we do have bunch of parameters:
>>>
>>> gcc --help=param | grep asan
>>>     asan-stack  Enable asan stack protection.
>>>     asan-instrument-allocas Enable asan allocas/VLAs protection.
>>>     asan-globals    Enable asan globals protection.
>>>     asan-instrument-writes  Enable asan store operations protection.
>>>     asan-instrument-reads   Enable asan load operations protection.
>>>     asan-memintrin  Enable asan builtin functions 
>>> protection.
>>>     asan-use-after-return   Enable asan detection of 
>>> use-after-return
>>> bugs.
>>>     asan-instrumentation-with-call-threshold Use callbacks instead of
>>> inline code if number of accesses in function becomes greater or equal
>>> to this number.
>>>
>>> We can probably use some of these in order to drive hwaddress in a
>>> similar way. Most of them makes sense for hwaddress as well
>>
>>
>> I would prefer to keep the options different but would be happy to share
>> them if others want.
> 
> Works for me.
> 
>>
>> I figure that there will be some parameters that make sense for hwasan
>> but not for asan.
>> For example hwasan-random-frame-tag doesn't have any analogue for asan.
>> Re-using `asan-stack` for hwasan would mean we would then need to decide
>> between having options with either `hwasan` or `asan` prefix being able
>> to affect hwasan, or introducing a parameter `asan-random-frame-tag`
>> that has no meaning on asan.
> 
> Then, I would come up with additional hwasan-* parameters that can have
> the same semantic as the for asan (if it makes sense).
> 

Do you mean that I should add the options and the ability to toggle 
these features?

The options that could eventually make sense for hwasan and that I 
haven't implemented are:
   hwasan-instrument-allocas
   hwasan-instrument-writes
   hwasan-instrument-reads
   hwasan-memintrin

hwasan-globals could eventually be done, but that's certainly not 
something I could implement quickly.

>>
>>
>>>
>>> b) Apparently clang prints also value of all registers. Do we want to do
>>> the same?
>>>
>>
>> I believe this only happens on clang for inline checking (try testing
>> clang using the parameter `-mllvm --hwasan-instrument-with-calls` to see
>> without).
> 
> Are we talking about usage of __hwasan_check_x0_18?
> 

Using `-mllvm --hwasan-instrument-with-calls` uses function calls like 
`__hwasan_store4` (similar to the asan instrumentation with calls.


The __hwasan_check_x* functions are the "generated checking functions 
with special calling conventions" I mentioned.

The special checking functions are generated to accept a calling 
convention generated based on what would be optimal for where a check is 
needed.  This custom calling convention can then also be used to keep 
track of what register values were around in the calling frame.

This was introduced in clang at revision:   llvm-svn: 351920

https://reviews.llvm.org/D56954

>>
>> This would be nice to have, but I'm not planning on looking at it any
>> time soon.
>> This is currently implemented in clang on top of two not-yet implemented
>> features: Inline instrumentation, and generated checking functions with
>> special calling conventions.
>>
>> It's certainly not something I'm aiming to get into GCC 10.
> 
> Which is fine.
> 
>>
>>
>>>
>>> c) I'm a bit confused by the pointer tags, where clang does:
>>>
>>
>> Yeah -- I left out a description of short-tags.
>> Hopefully my explanation in the below email helped.
>> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01968.html
>>
>>>
>>> d) Are you planning to come up with inline stack variable
>>> tagging/untagging for GCC 10?
>>
>> To make sure I understand the question correctly:
>> I think you're asking about writing tags into the shadow memory.
> 
> Yes, which will be very similar to current ASAN code in 
> asan_emit_stack_protection.
> I would expect a significant speed up from the inline shadow memory tag 
> emission?

Yes, I would als

  1   2   3   >