from:"Matthew Malcomson"

These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling targeting the
kernel.  This flag has defaults that allow compiling KASAN with tags as
it is currently implemented.
These defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.
Stack tagging in the kernel is a future aim, I don't know of any reason
it would not work, but this has not yet been tested.

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/invoke.texi: Document hwasan command line flags.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan--instrument-stack): New
(hwasan-random-frame-tag): New
(hwasan-instrument-allocas): New
(hwasan-instrument-reads): New
(hwasan-instrument-writes): New
(hwasan-instrument-mem-intrinsics): New
* target.def (HOOK_PREFIX): Add new hook.
(can_tag_addresses): Add new hook under memtag prefix.
* targhooks.c (default_memtag_can_tag_addresses): New.
* targhooks.h (default_memtag_can_tag_addresses): New decl.
* toplev.c (process_options): Ensure hwasan only on TBI
architectures.

gcc/c-family/ChangeLog:

* c-attribs.c (handle_no_sanitize_hwaddress_attribute): New
attribute.



### Attachment also inlined for ease of reply###


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 
372148315389db6671dfd943fd1a68670fcb1cbc..f8bf165aa48b5709c26f4e8245e5ab929b44fca6
 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
bool *);
 static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
  int, bool *);
+static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
+   int, bool *);
 static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
 int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
  handle_no_sanitize_address_attribute, NULL },
+  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
+ handle_no_sanitize_hwaddress_attribute, NULL },
   { "no_sanitize_thread", 0, 0, true, false, false, false,
  handle_no_sanitize_thread_attribute, NULL },
   { "no_sanitize_undefined",  0, 0, true, false, false, false,
@@ -946,6 +950,22 @@ handle_no_sanitize_address_attribute (tree *node, tree 
name, tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "no_sanitize_hwaddress" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_hwaddress_attribute (tree *node, tree name, tree, int,
+ bool *no_add_attrs)
+{
+  *no_add_attrs = true;
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+warning (OPT_Wattributes, "%qE attribute ignored", name);
+  else
+add_no_sanitize_value (*node, SANITIZE_HWADDRESS);
+
+  return NULL_TREE;
+}
+
 /* Handle a "no_sanitize_thread" attribute; arguments as in
struct

[PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

There are four main features to this change:

1) Check pointer tags match address tags.

In the new `hwasan` pass we put HWASAN_CHECK internal functions before
all memory accesses to check that tags in the pointer being used match
the tag stored in shadow memory for the memory region being used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(hwasan_instrument): New.
(hwasan_base): New.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
(class pass_hwasan): New.
(make_pass_hwasan): New.
(class pass_hwasan_O0): New.
(make_pass_hwasan_O0): New.
* asan.h (hwasan_base): New decl.
(hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(enum hwasan_mark_flags): New.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
* internal-fn.def (HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_ALLOCA_UNPOISON): New.
* passes.def: Add hwasan and hwasan_O0 passes.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.
(BUILT_IN_HWASAN_STORE8): New.
(BUILT_IN_HWASAN_STORE16): New.
(BUILT_IN_HWASAN_STOREN):

[PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
object is aligned to ensure the start of any other data stored on the
stack is in a different granule.

This patch ensures the above by adding alignment requirements in
`align_local_variable` and forcing all stack variable allocation to be
deferred so that `expand_stack_vars` can ensure the stack pointer is
aligned before allocating any variable for the current frame.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

The tag offsets are also handled by a backend hook.

This patch also adds some macros defining how the HWASAN shadow memory
is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable the tag to match the tag added to each pointer for
that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tag.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog:

* bootstrap-hwasan.mk: Disable random frame tags for
stack-tagging during bootstrap.

gcc/ChangeLog:

* asan.c (hwasan_record_base): New function.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New function.
(hwasan_with_tag): New function.
(hwasan_tag_init): New function.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_tag): New.
(hwasan_extract_tag): New.
(hwasan_emit_prologue): New.
(hwasan_create_untagged_base): New.
(hwasan_finish_file): New.
(hwasan_ctor_statements): New variable.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_p): New.
(hwasan_sanitize_allocas_p): New.
* asan.h (hwasan_record_base): New declaration.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New declaration.
(hwasan_with_tag): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_sanitize_allocas_p): New declaration.
(hwasan_tag_init): New declaration.
(hwasan_sanitize_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE): New macro.
(HWASAN_TAG_SHIFT_SIZE): New macro.
(HWASAN_SHIFT): New macro.
(HWASAN_SHIFT_RTX): New macro.
(HWASAN_STACK_BACKGROUND): New macro.
(hwasan_finish_file): New declaration.
(hwasan_current_tag): New declaration.
(hwasan_create_untagged_base): New declaration.
(hwasan_extract_tag): New declaration.
(hwasan_emit_prologue): New declaration.
* cfgexpand.c (struct stack_vars_data): Add information to
record hwasan variable stack offsets.
(expand_stack_vars): Ensure variables are offset from a tagged
base. Record offsets for hwasan. Ensure alignment.
(expand_used_vars): Call function to emit prologue, and get
untagging instructions for function exit.
(align_local_variable): Ensure alignment.
(defer_stack_allocation): Ensure all variables are deferred so
they can be handled by

[PATCH 7/X] libsanitizer: Add tests

Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-2.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-3.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-4.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-5.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-6.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-7.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/param-instrument-mem-intrinsics.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New file.
* g++.dg/hwasan/rvo-handled.C: New test.
* gcc.dg/hwasan/hwasan.exp: New file.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New file.


hwasan-diff6.patch.gz
Description: application/gzip

[PATCH 1/X] libsanitizer: Tie the hwasan library into our build system

This patch tries to tie libhwasan into the GCC build system in the same way
that the other sanitizer runtime libraries are handled.

libsanitizer/ChangeLog:

* Makefile.am:  Build libhwasan.
* Makefile.in:  Build libhwasan.
* asan/Makefile.in:  Build libhwasan.
* configure:  Build libhwasan.
* configure.ac:  Build libhwasan.
* hwasan/Makefile.am: New file.
* hwasan/Makefile.in: New file.
* hwasan/libtool-version: New file.
* interception/Makefile.in: Build libhwasan.
* libbacktrace/Makefile.in: Build libhwasan.
* libsanitizer.spec.in: Build libhwasan.
* lsan/Makefile.in: Build libhwasan.
* sanitizer_common/Makefile.in: Build libhwasan.
* tsan/Makefile.in: Build libhwasan.
* ubsan/Makefile.in: Build libhwasan.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,11 +14,12 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan
+SUBDIRS += lsan asan ubsan hwasan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
-  include/sanitizer/tsan_interface.h
+  include/sanitizer/tsan_interface.h \
+  include/sanitizer/hwasan_interface.h
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
02c7f70ac6578a3e93a490ce8bd2c54fc0693c50..2c57d49cbffdb486645aeb5f2c0f85d6e0fad124
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -92,7 +92,8 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@am__append_1 = 
include/sanitizer/common_interface_defs.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/lsan_interface.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/asan_interface.h \
-@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h \
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/hwasan_interface.h
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
@@ -207,7 +208,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan tsan
+   ubsan hwasan tsan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -329,6 +330,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
@@ -362,7 +364,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 
29622bf466a37f819c9fade30e31195adda51190..25c7fd7b7597d6e243005a1bb7de5b6243d2cfcf
 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -383,6 +383,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
04eca04fbe5e59bae1ba00597de0cf1b7cf1b5fa..9ed9669a85d3cfc2f2f623e796e61a5f8f7e4ded
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -657,6 +657,7 @@ USING_MAC_INTERPOSE_TRUE
 link_liblsan
 link_libubsan
 link_libtsan
+link_libhwasan
 link_libasan
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
@@ -12361,7 +12362,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12364 "configure"
+#line 12365 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12467,7 +12468,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12470 "configure"
+#line 12471 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15943,6 +15944,10 @@ fi
 link_libasan=$link_sanitizer_common
 
 
+# Set up the set of additional

[PATCH 2/X] libsanitizer: Only build libhwasan when targeting AArch64

Though the library has limited support for x86, we don't have any
support for generating code targeting x86 so there is no point building
for that target.

libsanitizer/ChangeLog:

* Makefile.am: Condition building hwasan directory.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Set HWASAN_SUPPORTED based on target
architecture.
* configure.tgt: Likewise.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
2a7e8e1debe838719db0f0fad218b2543cc3111b..065a65e78d49f7689a01ecb64db1f07ca83aa987
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,7 +14,7 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan hwasan
+SUBDIRS += lsan asan ubsan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
@@ -23,6 +23,9 @@ nodist_saninclude_HEADERS += \
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
+if HWASAN_SUPPORTED
+SUBDIRS += hwasan
+endif
 endif
 
 ## May be used by toolexeclibdir.
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
2c57d49cbffdb486645aeb5f2c0f85d6e0fad124..3873ea4d7050f04a3f7bbd0dd3f2a71e9b65d287
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -97,6 +97,7 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
+@HWASAN_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_5 = hwasan
 subdir = .
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
@@ -208,7 +209,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan hwasan tsan
+   ubsan tsan hwasan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -364,7 +365,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ $(am__append_4) $(am__append_5)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
9ed9669a85d3cfc2f2f623e796e61a5f8f7e4ded..cc5c229f4aebcdd454e9e2e415a8e16046dc1b1a
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -659,6 +659,8 @@ link_libubsan
 link_libtsan
 link_libhwasan
 link_libasan
+HWASAN_SUPPORTED_FALSE
+HWASAN_SUPPORTED_TRUE
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
 TSAN_SUPPORTED_FALSE
@@ -12362,7 +12364,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12365 "configure"
+#line 12367 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12468,7 +12470,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12471 "configure"
+#line 12473 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -15819,6 +15821,7 @@ fi
 # Get target configury.
 unset TSAN_SUPPORTED
 unset LSAN_SUPPORTED
+unset HWASAN_SUPPORTED
 . ${srcdir}/configure.tgt
  if test "x$TSAN_SUPPORTED" = "xyes"; then
   TSAN_SUPPORTED_TRUE=
@@ -15836,6 +15839,14 @@ else
   LSAN_SUPPORTED_FALSE=
 fi
 
+ if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  HWASAN_SUPPORTED_TRUE=
+  HWASAN_SUPPORTED_FALSE='#'
+else
+  HWASAN_SUPPORTED_TRUE='#'
+  HWASAN_SUPPORTED_FALSE=
+fi
+
 
 # Check for functions needed.
 for ac_func in clock_getres clock_gettime clock_settime lstat readlink
@@ -16818,7 +16829,7 @@ ac_config_files="$ac_config_files Makefile 
libsanitizer.spec libbacktrace/backtr
 ac_config_headers="$ac_config_headers config.h"
 
 
-ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
hwasan/Makefile ubsan/Makefile"
+ac_config_files="$ac_config_files interception/Makefile 
sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile 
ubsan/Makefile"
 
 
 if test "x$TSAN_SUPPORTED" = "xyes"; then
@@ -16826,6 +16837,11 @@ if test "x$TSAN_SUPPORTED" = "xyes"; then
 
 fi
 
+if test "x$HWASAN_SUPPORTED" = "xyes"; then
+  ac_config_files="$ac_config_files hwasan/Makefile"
+
+fi
+
 
 
 
@@ -17090,6 +17106,10 @@ if test -z "${LSAN_SUPPORTED_TRUE}" && test -z 
"${LSAN_SUPPORTED_FALSE}"; then
   as_fn_error $? "conditional \"LSAN_SUPPORTED\" was never defined.
 Usually

[PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN

This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

* bootstrap-hwasan.mk: New file.

gcc/ChangeLog:

* doc/install.texi: Document new option.

libiberty/ChangeLog:

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
a0c5aca9e8d5cae2782c8fe4625a501853dc226a..203319e3f899e8d24429950c3a5d22927fb5150f
 100755
--- a/configure
+++ b/configure
@@ -8297,7 +8297,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
1a53ed418e4d97606356b14a17b50186c79adcd3..9d5c187c31bfc01003e75058896b686807e47643
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2809,7 +2809,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
d581a34653f61a440b3c3b832836fe109e2fbd08..25d041fcbb1f7c16f7ac47b7b5d4ea8308c6f69c
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2767,6 +2767,11 @@ the build tree.
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
 
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+targets running a Linux kernel that supports the required ABI (5.4 or later).
+
 @end table
 
 @section Building a cross compiler
diff --git a/libiberty/configure b/libiberty/configure
index 
1f8e23f0d235a6a5d5158bf6705023db95ac7023..59e0b73d5838bbd42a5548759084471e97ec254f
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5264,6 +5264,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
4e2599c14a89bafcb8c7e523b9ce5b3d60b8c0f6..ad952963971a31968b5d109661b9cab0aa4b95fc
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
ba5882df7a7272f65219191c82ecd78ab4d3725e..50d6e09dac881d28d4ff70def47b09ed8c0ea66c
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@ $(CET_HOST_FLAGS)
 AM_LDFLAGS = @ac_lto_plugin_ldflags@
 AM_LIBTOOLFLAGS = --tag=disable-static
-override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS))
-override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS))
+override CFLAGS := $(filter-out -fsanitize=address 
-fsanitize=hwaddress,$(CFLAGS))
+override LDFLAGS := $(filter-out -fsanitize=address 
-fsanitize=hwaddress,$(LDFLAGS))
 
 libexecsub_LTLIBRARIES = liblto_plugin.la
 gcc_build_dir = @gcc_build_dir@
diff --git

Re: SLS Mitigation patches backported for GCC9

2020-07-24 Thread Matthew Malcomson

On 24/07/2020 12:01, Kyrylo Tkachov wrote:

Hi Matthew,

-Original Message-
From: Matthew Malcomson 
Sent: 21 July 2020 16:16
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw ; Kyrylo Tkachov
; Ross Burton 
Subject: SLS Mitigation patches backported for GCC9

Hello,

Eventually we will want to backport the SLS patches to older branches.

When the GCC10 release is unfrozen we will work on getting the same
patches
already posted backported to that branch.  The patches already posted on
the
mailing list apply cleanly to the current releases/gcc-10 branch.

I've heard interest in having the GCC 9 patches, so I'm posting the modified
versions upstream sooner than otherwise.

I'd say let's go ahead with the GCC 10 patches (assuming testing works out well 
on there).
For the GCC 9 patches it would be useful if you included a bit of text of how 
they differ from the GCC 10/11 patches.
This would speed up the technical review.
Thanks,
Kyrill

Cheers,
Matthew

Entire patch series attached to cover letter.

Below were the only two "interesting" hunks that failed to apply after 
`patch -p1`.

The differences causing these were:
- in GCC-9 the `retab` instruction wasn't in the "do_return" pattern.
- `simple_return` had "aarch64_use_simple_return_insn_p ()" as a
   condition.

--- gcc/config/aarch64/aarch64.md
+++ gcc/config/aarch64/aarch64.md
@@ -863,18 +882,23 @@
   [(return)]
   ""
   {
+const char *ret = NULL;
 if (aarch64_return_address_signing_enabled ()
&& TARGET_ARMV8_3
&& !crtl->calls_eh_return)
   {
if (aarch64_ra_sign_key == AARCH64_KEY_B)
- return "retab";
+ ret = "retab";
else
- return "retaa";
+ ret = "retaa";
   }
-return "ret";
+else
+  ret = "ret";
+output_asm_insn (ret, operands);
+return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
   }
-  [(set_attr "type" "branch")]
+  [(set_attr "type" "branch")
+   (set_attr "sls_length" "retbr")]
 )

 (define_expand "return"
@@ -886,8 +910,12 @@
 (define_insn "simple_return"
   [(simple_return)]
   ""
-  "ret"
-  [(set_attr "type" "branch")]
+  {
+output_asm_insn ("ret", operands);
+return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
+  }
+  [(set_attr "type" "branch")
+   (set_attr "sls_length" "retbr")]
 )

 (define_insn "*cb1"

aarch64: (GCC-9 Backport) New Straight Line Speculation (SLS) mitigation flags

Here we introduce the flags that will be used for straight line speculation.

The new flag introduced is `-mharden-sls=`.
This flag can take arguments of `none`, `all`, or a comma seperated list
of one or more of `retbr` or `blr`.
`none` indicates no special mitigation of the straight line speculation
vulnerability.
`all` requests all mitigations currently implemented.
`retbr` requests that the RET and BR instructions have a speculation
barrier inserted after them.
`blr` requests that BLR instructions are replaced by a BL to a function
stub using a BR with a speculation barrier after it.

Setting this on a per-function basis using attributes or the like is not
enabled, but may be in the future.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_harden_sls_retbr_p):
New.
(aarch64_harden_sls_blr_p): New.
* config/aarch64/aarch64.c (enum aarch64_sls_hardening_type):
New.
(aarch64_harden_sls_retbr_p): New.
(aarch64_harden_sls_blr_p): New.
(aarch64_validate_sls_mitigation): New.
(aarch64_override_options): Parse options for SLS mitigation.
* config/aarch64/aarch64.opt (-mharden-sls): New option.
* doc/invoke.texi: Document new option.

(cherry picked from commit a9ba2a9b77bec7eacaf066801f22d1c366a2bc86)


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
af2a17f0bf3b50a306b14d8c0aa431269f54ef2e..db5a6a3a1813abbbeeaaa7009466ac58595e12bc
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -658,4 +658,7 @@ extern const atomic_ool_names aarch64_ool_ldset_names;
 extern const atomic_ool_names aarch64_ool_ldclr_names;
 extern const atomic_ool_names aarch64_ool_ldeor_names;
 
+extern bool aarch64_harden_sls_retbr_p (void);
+extern bool aarch64_harden_sls_blr_p (void);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
1b6e67ccd53da05c88b13cac8b524ba2b8cfe43f..0903370b6128e19016f97cb6b0fd44e6b51e569e
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11791,6 +11791,79 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
   return false;
 }
 
+/* Straight line speculation indicators.  */
+enum aarch64_sls_hardening_type
+{
+  SLS_NONE = 0,
+  SLS_RETBR = 1,
+  SLS_BLR = 2,
+  SLS_ALL = 3,
+};
+static enum aarch64_sls_hardening_type aarch64_sls_hardening;
+
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_retbr_p (void)
+{
+  return aarch64_sls_hardening & SLS_RETBR;
+}
+
+/* Return whether we should mitigatate Straight Line Speculation for the BLR
+   instruction.  */
+bool
+aarch64_harden_sls_blr_p (void)
+{
+  return aarch64_sls_hardening & SLS_BLR;
+}
+
+/* As of yet we only allow setting these options globally, in the future we may
+   allow setting them per function.  */
+static void
+aarch64_validate_sls_mitigation (const char *const_str)
+{
+  char *token_save = NULL;
+  char *str = NULL;
+
+  if (strcmp (const_str, "none") == 0)
+{
+  aarch64_sls_hardening = SLS_NONE;
+  return;
+}
+  if (strcmp (const_str, "all") == 0)
+{
+  aarch64_sls_hardening = SLS_ALL;
+  return;
+}
+
+  char *str_root = xstrdup (const_str);
+  str = strtok_r (str_root, ",", _save);
+  if (!str)
+error ("invalid argument given to %<-mharden-sls=%>");
+
+  int temp = SLS_NONE;
+  while (str)
+{
+  if (strcmp (str, "blr") == 0)
+   temp |= SLS_BLR;
+  else if (strcmp (str, "retbr") == 0)
+   temp |= SLS_RETBR;
+  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
+   {
+ error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ break;
+   }
+  else
+   {
+ error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
+ break;
+   }
+  str = strtok_r (NULL, ",", _save);
+}
+  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
+  free (str_root);
+}
+
 /* Parses CONST_STR for branch protection features specified in
aarch64_branch_protect_types, and set any global variables required.  
Returns
the parsing result and assigns LAST_STR to the last processed token from
@@ -12029,6 +12102,9 @@ aarch64_override_options (void)
   selected_arch = NULL;
   selected_tune = NULL;
 
+  if (aarch64_harden_sls_string)
+aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
+
   if (aarch64_branch_protection_string)
 aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 
f474a28eb9234cf87b22492c4396d5a31857cd39..4beb2d3d243c4ca36e7604c02ab5d9e80955f55a
 100644
--- a/gcc/config/aarch64/aarch64.opt
+++

aarch64: (GCC-9 Backport) Introduce SLS mitigation for RET and BR instructions

Instructions following RET or BR are not necessarily executed.  In order
to avoid speculation past RET and BR we can simply append a speculation
barrier.

Since these speculation barriers will not be architecturally executed,
they are not expected to add a high performance penalty.

The speculation barrier is to be SB when targeting architectures which
have this enabled, and DSB SY + ISB otherwise.

We add tests for each of the cases where such an instruction was seen.

This is implemented by modifying each machine description pattern that
emits either a RET or a BR instruction.  We choose not to use something
like `TARGET_ASM_FUNCTION_EPILOGUE` since it does not affect the
`indirect_jump`, `jump`, `sibcall_insn` and `sibcall_value_insn`
patterns and we find it preferable to implement the functionality in the
same way for every pattern.

There is one particular case which is slightly tricky.  The
implementation of TARGET_ASM_TRAMPOLINE_TEMPLATE uses a BR which needs
to be mitigated against.  The trampoline template is used *once* per
compilation unit, and the TRAMPOLINE_SIZE is exposed to the user via the
builtin macro __LIBGCC_TRAMPOLINE_SIZE__.
In the future we may implement function specific attributes to turn on
and off hardening on a per-function basis.
The fixed nature of the trampoline described above implies it will be
safer to ensure this speculation barrier is always used.

Testing:
  Bootstrap and regtest done on aarch64-none-linux
  Used a temporary hack(1) to use these options on every test in the
  testsuite and a script to check that the output never emitted an
  unmitigated RET or BR.

1) Temporary hack was a change to the testsuite to always use
`-save-temps` and run a script on the assembly output of those
compilations which produced one to ensure every RET or BR is immediately
followed by a speculation barrier.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_sls_barrier): New.
* config/aarch64/aarch64.c (aarch64_output_casesi): Emit
speculation barrier after BR instruction if needs be.
(aarch64_trampoline_init): Handle ptr_mode value & adjust size
of code copied.
(aarch64_sls_barrier): New.
(aarch64_asm_trampoline_template): Add needed barriers.
* config/aarch64/aarch64.h (AARCH64_ISA_SB): New.
(TARGET_SB): New.
(TRAMPOLINE_SIZE): Account for barrier.
* config/aarch64/aarch64.md (indirect_jump, *casesi_dispatch,
simple_return, *do_return, *sibcall_insn, *sibcall_value_insn):
Emit barrier if needs be, also account for possible barrier using
"sls_length" attribute.
(sls_length): New attribute.
(length): Determine default using any non-default sls_length
value.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sls-mitigation/sls-miti-retbr.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c:
New test.
* gcc.target/aarch64/sls-mitigation/sls-mitigation.exp: New file.
* lib/target-supports.exp (check_effective_target_aarch64_asm_sb_ok):
New proc.

(cherry picked from be178ecd5ac1fe1510d960ff95c66d0ff831afe1)


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
db5a6a3a1813abbbeeaaa7009466ac58595e12bc..46196fe648e06f6b7543d8656cd6be6b39f8f774
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -658,6 +658,7 @@ extern const atomic_ool_names aarch64_ool_ldset_names;
 extern const atomic_ool_names aarch64_ool_ldclr_names;
 extern const atomic_ool_names aarch64_ool_ldeor_names;
 
+const char *aarch64_sls_barrier (int);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
0e9b79e23f05f34cfee5fdccf5792af077a95798..8dcc9b1cec347104ca3bbcc34a45b30fcf35a886
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -235,6 +235,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_F16FML(aarch64_isa_flags & AARCH64_FL_F16FML)
 #define AARCH64_ISA_RCPC8_4   (aarch64_isa_flags & AARCH64_FL_RCPC8_4)
 #define AARCH64_ISA_V8_5  (aarch64_isa_flags & AARCH64_FL_V8_5)
+#define AARCH64_ISA_SB(aarch64_isa_flags & AARCH64_FL_SB)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -285,6 +286,9 @@ extern unsigned aarch64_architecture_version;
 #define TARGET_FIX_ERR_A53_835769_DEFAULT 1
 #endif
 
+/* SB instruction is enabled through +sb.  */
+#define TARGET_SB (AARCH64_ISA_SB)
+
 /* Apply the workaround for Cortex-A53 erratum 835769.  */
 #define TARGET_FIX_ERR_A53_835769  \
   ((aarch64_fix_a53_err835769 == 2)\
@@ -931,8 +935,10 @@ typedef struct
 
 #define RETURN_ADDR_RTX

aarch64: (GCC-9 Backport) Mitigate SLS for BLR instruction

This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which uses
a BR to jump to the original value.  These function stubs are then
appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for all functions.
This set of stubs can have human readable names, and we are using
`__call_indirect_x` for register x.

When BTI branch protection is enabled the BLR instruction can jump to a
`BTI c` instruction using any register, while the BR instruction can
only jump to a `BTI c` instruction using the x16 or x17 registers.
Hence, in order to ensure this transformation is safe we mov the value
of the original register into x16 and use x16 for the BR.

As an example when optimising for size:
a
BLR x0
instruction would get transformed to something like
BL __call_indirect_x0
where __call_indirect_x0 labels a thunk that contains
__call_indirect_x0:
MOV X16, X0
BR X16


The first version of this patch used local symbols specific to a
compilation unit to try and avoid relocations.
This was mistaken since functions coming from the same compilation unit
can still be in different sections, and the assembler will insert
relocations at jumps between sections.

On any relocation the linker is permitted to emit a veneer to handle
jumps between symbols that are very far apart.  The registers x16 and
x17 may be clobbered by these veneers.
Hence the function stubs cannot rely on the values of x16 and x17 being
the same as just before the function stub is called.

Similar can be said for the hot/cold partitioning of single functions,
so function-local stubs have the same restriction.

This updated version of the patch never emits function stubs for x16 and
x17, and instead forces other registers to be used.

Given the above, there is now no benefit to local symbols (since they
are not enough to avoid dealing with linker intricacies).  This patch
now uses global symbols with hidden visibility each stored in their own
COMDAT section.  This means stubs can be shared between compilation
units while still avoiding the PLT indirection.

This patch also removes the `__call_indirect_x30` stub (and
function-local equivalent) which would simply jump back to the original
location.

The function-local stubs are emitted to the assembly output file in one
chunk, which means we need not add the speculation barrier directly
after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

The global stubs are emitted in COMDAT/.linkonce sections by
themselves so that the linker can remove duplicates from multiple object
files.  This means they are not emitted in one chunk, and each one must
include the speculation barrier.

Another difference is that since the global stubs are shared across
compilation units we do not know that all functions will be targeting an
architecture supporting the SB instruction.
Rather than provide multiple stubs for each architecture, we provide a
stub that will work for all architectures -- using the DSB+ISB barrier.

This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revalation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revalation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm):
New declaration.
*

SLS Mitigation patches backported for GCC9

Hello,

Eventually we will want to backport the SLS patches to older branches.

When the GCC10 release is unfrozen we will work on getting the same patches
already posted backported to that branch.  The patches already posted on the
mailing list apply cleanly to the current releases/gcc-10 branch.

I've heard interest in having the GCC 9 patches, so I'm posting the modified
versions upstream sooner than otherwise.

Cheers,
Matthew

Entire patch series attached to cover letter.

all-patches.tar.gz
Description: application/gzip

Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

2020-07-03 Thread Matthew Malcomson

With suggestions applied.
Testing with `-mabi=ilp32` found a bug around the trampoline
initialisation where the new larger size of the trampoline caused a
different execution path of `emit_block_move` which ICE'd on the
pre-existing `ptr_mode` address.


Commit Message
--

Instructions following RET or BR are not necessarily executed.  In order
to avoid speculation past RET and BR we can simply append a speculation
barrier.

Since these speculation barriers will not be architecturally executed,
they are not expected to add a high performance penalty.

The speculation barrier is to be SB when targeting architectures which
have this enabled, and DSB SY + ISB otherwise.

We add tests for each of the cases where such an instruction was seen.

This is implemented by modifying each machine description pattern that
emits either a RET or a BR instruction.  We choose not to use something
like `TARGET_ASM_FUNCTION_EPILOGUE` since it does not affect the
`indirect_jump`, `jump`, `sibcall_insn` and `sibcall_value_insn`
patterns and we find it preferable to implement the functionality in the
same way for every pattern.

There is one particular case which is slightly tricky.  The
implementation of TARGET_ASM_TRAMPOLINE_TEMPLATE uses a BR which needs
to be mitigated against.  The trampoline template is used *once* per
compilation unit, and the TRAMPOLINE_SIZE is exposed to the user via the
builtin macro __LIBGCC_TRAMPOLINE_SIZE__.
In the future we may implement function specific attributes to turn on
and off hardening on a per-function basis.
The fixed nature of the trampoline described above implies it will be
safer to ensure this speculation barrier is always used.

Testing:
  Bootstrap and regtest done on aarch64-none-linux
  Used a temporary hack(1) to use these options on every test in the
  testsuite and a script to check that the output never emitted an
  unmitigated RET or BR.


1) Temporary hack was a change to the testsuite to always use
`-save-temps` and run a script on the assembly output of those
compilations which produced one to ensure every RET or BR is immediately
followed by a speculation barrier.


gcc/ChangeLog:

2020-07-03  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_sls_barrier): New.
* config/aarch64/aarch64.c (aarch64_output_casesi): Emit
speculation barrier after BR instruction if needs be.
(aarch64_trampoline_init): Handle ptr_mode value & adjust size
of code copied.
(aarch64_sls_barrier): New.
(aarch64_asm_trampoline_template): Add needed barriers.
* config/aarch64/aarch64.h (AARCH64_ISA_SB): New.
(TARGET_SB): New.
(TRAMPOLINE_SIZE): Account for barrier.
* config/aarch64/aarch64.md (indirect_jump, *casesi_dispatch,
*do_return, simple_return, *sibcall_insn, *sibcall_value_insn):
Emit barrier if needs be, also account for possible barrier using
"sls_length" attribute.
(sls_length): New attribute.
(length): Determine default using any non-default sls_length
value.
* config/aarch64/aarch64.opt (-mharden-sls-retbr): Introduce new
option.

gcc/testsuite/ChangeLog:

2020-07-03  Matthew Malcomson  

* gcc.target/aarch64/sls-mitigation/sls-miti-retbr.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c:
New test.
* gcc.target/aarch64/sls-mitigation/sls-mitigation.exp: New file.
* lib/target-supports.exp (check_effective_target_aarch64_asm_sb_ok):
New proc.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
8ca67d7e69edaf73c84f079e7e1c483009ad10c0..b035e4ec78e2ef1c9a931148dffacf6a50345b84
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,6 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+const char *aarch64_sls_barrier (int);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
2be52fd4d73def0007795159298e3d3e8fc4399d..d60d295830d3e422bb4267de275597d2087b99e6
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -281,6 +281,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_F32MM (aarch64_isa_flags & AARCH64_FL_F32MM)
 #define AARCH64_ISA_F64MM (aarch64_isa_flags & AARCH64_FL_F64MM)
 #define AARCH64_ISA_BF16  (aarch64_isa_flags & AARCH64_FL_BF16)
+#define AARCH64_ISA_SB(aarch64_isa_flags & AARCH64_FL_SB)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -378,6 +379,9 @@ extern unsigned

Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

2020-07-03 Thread Matthew Malcomson

With suggestions applied.


Here we introduce the flags that will be used for straight line speculation.

The new flag introduced is `-mharden-sls=`.
This flag can take arguments of `none`, `all`, or a comma seperated list of one
or more of `retbr` or `blr`.
`none` indicates no special mitigation of the straight line speculation
vulnerability.
`all` requests all mitigations currently implemented.
`retbr` requests that the RET and BR instructions have a speculation barrier
inserted after them.
`blr` requests that BLR instructions are replaced by a BL to a function stub
using a BR with a speculation barrier after it.

Setting this on a per-function basis using attributes or the like is not
enabled, but may be in the future.

gcc/ChangeLog:

2020-07-03  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_harden_sls_retbr_p):
New.
(aarch64_harden_sls_blr_p): New.
* config/aarch64/aarch64.c (enum aarch64_sls_hardening_type):
New.
(aarch64_harden_sls_retbr_p): New.
(aarch64_harden_sls_blr_p): New.
(aarch64_validate_sls_mitigation): New.
(aarch64_override_options): Parse options for SLS mitigation.
* config/aarch64/aarch64.opt (-mharden-sls): New option.
* doc/invoke.texi: Document new option.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
9e43adb7db0373df6cc5ef1d2b22f217aca2aad2..8ca67d7e69edaf73c84f079e7e1c483009ad10c0
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,4 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+extern bool aarch64_harden_sls_retbr_p (void);
+extern bool aarch64_harden_sls_blr_p (void);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
f3551a73d87c4e686540f39224985592c3c66fd1..b1a7c10c4eaadd78eb45926c23efc51a8272b5fd
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14502,6 +14502,80 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
   return false;
 }
 
+/* Straight line speculation indicators.  */
+enum aarch64_sls_hardening_type
+{
+  SLS_NONE = 0,
+  SLS_RETBR = 1,
+  SLS_BLR = 2,
+  SLS_ALL = 3,
+};
+static enum aarch64_sls_hardening_type aarch64_sls_hardening;
+
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_retbr_p (void)
+{
+  return aarch64_sls_hardening & SLS_RETBR;
+}
+
+/* Return whether we should mitigatate Straight Line Speculation for the BLR
+   instruction.  */
+bool
+aarch64_harden_sls_blr_p (void)
+{
+  return aarch64_sls_hardening & SLS_BLR;
+}
+
+/* As of yet we only allow setting these options globally, in the future we may
+   allow setting them per function.  */
+static void
+aarch64_validate_sls_mitigation (const char *const_str)
+{
+  char *token_save = NULL;
+  char *str = NULL;
+
+  aarch64_sls_hardening = SLS_NONE;
+  if (strcmp (const_str, "none") == 0)
+{
+  aarch64_sls_hardening = SLS_NONE;
+  return;
+}
+  if (strcmp (const_str, "all") == 0)
+{
+  aarch64_sls_hardening = SLS_ALL;
+  return;
+}
+
+  char *str_root = xstrdup (const_str);
+  str = strtok_r (str_root, ",", _save);
+  if (!str)
+error ("invalid argument given to %<-mharden-sls=%>");
+
+  int temp = SLS_NONE;
+  while (str)
+{
+  if (strcmp (str, "blr") == 0)
+   temp |= SLS_BLR;
+  else if (strcmp (str, "retbr") == 0)
+   temp |= SLS_RETBR;
+  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
+   {
+ error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ break;
+   }
+  else
+   {
+ error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
+ break;
+   }
+  str = strtok_r (NULL, ",", _save);
+}
+  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
+  free (str_root);
+}
+
 /* Parses CONST_STR for branch protection features specified in
aarch64_branch_protect_types, and set any global variables required.  
Returns
the parsing result and assigns LAST_STR to the last processed token from
@@ -14746,6 +14820,9 @@ aarch64_override_options (void)
   selected_arch = NULL;
   selected_tune = NULL;
 
+  if (aarch64_harden_sls_string)
+aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
+
   if (aarch64_branch_protection_string)
 aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 
d99d14c137d8774d

Re: [Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags


On 23/06/2020 16:48, Richard Sandiford wrote:

Matthew Malcomson  writes:

@@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
return false;
  mfix-cortex-a53-835769
  Target Report Var(aarch64_fix_a53_err835769) Init(2) Save
  Workaround for ARM Cortex-A53 Erratum number 835769.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 
35e8242af5fa4c52744fd2c3e2cfee0a617e22bb..8a3fab2964c9bb06c820766d284768751d63ac9a
 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -696,6 +696,7 @@ Objective-C and Objective-C++ Dialects}.
  -msign-return-address=@var{scope} @gol
  -mbranch-protection=@var{none}|@var{standard}|@var{pac-ret}[+@var{leaf}
  +@var{b-key}]|@var{bti} @gol
+-mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr} @gol
  -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
  -moverride=@var{string}  -mverbose-cost-dump @gol
  -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} 
@gol
@@ -17045,6 +17046,15 @@ functions.  The optional argument @samp{b-key} can be 
used to sign the functions
  with the B-key instead of the A-key.
  @samp{bti} turns on branch target identification mechanism.
  
+@item -mharden-sls=@var{none}|@var{all}|@var{retbr}|@var{blr}

+@opindex mharden-sls
+Enable compiler hardening against straight line speculation (SLS).
+There are two options for hardening against straight line speculation.
+@samp{retbr} allows inserting speculation barriers after every
+@samp{br} and @samp{ret} instruction.  While @samp{blr} enables replacing
+@samp{blr} instructions with a @samp{bl} to a function stub.
+@samp{all} enables all SLS hardening, while @samp{none} does not enable any.


OK, so this is even more picky, sorry, but the syntax and description
imply to me that you can choose only one of the four options.  I think
it would be more accurate to say something like:

@item -mharden-sls=@var{opts}
@opindex mharden-sls
Enable compiler hardening against straight line speculation (SLS).
@var{opts} is a comma-separated list of the following options:
@table @samp
@item retbr
…
@item blr
…
@end table
In addition, @samp{-mharden-sls=all} enables all SLS hardening
while @samp{-mharden-sls=none} disables all SLS hardening.

(assuming the above behaviour change for “none”)

Thanks,
Richard



Another "just to check": the same change should be made in the short 
form right? (i.e. the hunk above is now `-mharden-sls=@var{opts}`)

Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions


On 23/06/2020 17:56, Richard Sandiford wrote:

Matthew Malcomson  writes:

On 23/06/2020 17:17, Richard Sandiford wrote:

Matthew Malcomson  writes:

--- a/gcc/config/aarch64/aarch64-protos.h
+/* Ensure there are no BR or RET instructions which are not directly followed
+   by a speculation barrier.  */
+/* { dg-final { scan-assembler-not 
"\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */


Isn't the “sb” alternative invalid given the -march option?

Probably slightly easier to read if the regexp is quoted using {…}
rather than "…".  Same for the other tests.



Just to check before I respin:  Using {} instead of "" means I need to
replace \t with a literal tab -- do you still prefer it?


Are you sure?  We've been using tests like:

/* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.s, z[0-9]+\.s, 
z[0-9]+\.s\n} 1 } } */

for SVE without problems.  Using {…} means that backslash quoting
is applied by the regexp parser rather than the Tcl string parser,
but both should work for things like \t.

Richard



Ah -- my mistake -- I was just checking with `string compare` while 
making the change and didn't think too hard when I saw a -1.

Re: [Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions


On 23/06/2020 17:17, Richard Sandiford wrote:

Matthew Malcomson  writes:

--- a/gcc/config/aarch64/aarch64-protos.h
+/* Ensure there are no BR or RET instructions which are not directly followed
+   by a speculation barrier.  */
+/* { dg-final { scan-assembler-not 
"\t(br|ret|retaa|retab)\tx\[0-9\]\[0-9\]?\n\t(?!dsb\tsy\n\tisb|sb)" } } */


Isn't the “sb” alternative invalid given the -march option?

Probably slightly easier to read if the regexp is quoted using {…}
rather than "…".  Same for the other tests.



Just to check before I respin:  Using {} instead of "" means I need to 
replace \t with a literal tab -- do you still prefer it?

[Patch v2 3/3] aarch64: Mitigate SLS for BLR instruction

This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which uses
a BR to jump to the original value.  These function stubs are then
appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for all functions.
This set of stubs can have human readable names, and we are using
`__call_indirect_x` for register x.

When BTI branch protection is enabled the BLR instruction can jump to a
`BTI c` instruction using any register, while the BR instruction can
only jump to a `BTI c` instruction using the x16 or x17 registers.
Hence, in order to ensure this transformation is safe we mov the value
of the original register into x16 and use x16 for the BR.

As an example when optimising for size:
a
BLR x0
instruction would get transformed to something like
BL __call_indirect_x0
where __call_indirect_x0 labels a thunk that contains
__call_indirect_x0:
MOV X16, X0
BR X16



The first version of this patch used local symbols specific to a
compilation unit to try and avoid relocations.
This was mistaken since functions coming from the same compilation unit
can still be in different sections, and the assembler will insert
relocations at jumps between sections.

On any relocation the linker is permitted to emit a veneer to handle
jumps between symbols that are very far apart.  The registers x16 and
x17 may be clobbered by these veneers.
Hence the function stubs cannot rely on the values of x16 and x17 being
the same as just before the function stub is called.

Similar can be said for the hot/cold partitioning of single functions,
so function-local stubs have the same restriction.

This updated version of the patch never emits function stubs for x16 and
x17, and instead forces other registers to be used.


Given the above, there is now no benefit to local symbols (since they
are not enough to avoid dealing with linker intricacies).  This patch
now uses global symbols with hidden visibility each stored in their own
COMDAT section.  This means stubs can be shared between compilation
units while still avoiding the PLT indirection.


This patch also removes the `__call_indirect_x30` stub (and
function-local equivalent) which would simply jump back to the original
location.


The function-local stubs are emitted to the assembly output file in one
chunk, which means we need not add the speculation barrier directly
after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

The global stubs are emitted in COMDAT/.linkonce sections by
themselves so that the linker can remove duplicates from multiple object
files.  This means they are not emitted in one chunk, and each one must
include the speculation barrier.

Another difference is that since the global stubs are shared across
compilation units we do not know that all functions will be targeting an
architecture supporting the SB instruction.
Rather than provide multiple stubs for each architecture, we provide a
stub that will work for all architectures -- using the DSB+ISB barrier.


This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revalation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revalation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.


gcc/ChangeLog:

2020-06-23  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_indirec

[Patch 2/3] aarch64: Introduce SLS mitigation for RET and BR instructions

Instructions following RET or BR are not necessarily executed.  In order
to avoid speculation past RET and BR we can simply append a speculation
barrier.

Since these speculation barriers will not be architecturally executed,
they are not expected to add a high performance penalty.

The speculation barrier is to be SB when targeting architectures which
have this enabled, and DSB SY + ISB otherwise.

We add tests for each of the cases where such an instruction was seen.

This is implemented by modifying each machine description pattern that
emits either a RET or a BR instruction.  We choose not to use something
like `TARGET_ASM_FUNCTION_EPILOGUE` since it does not affect the
`indirect_jump`, `jump`, `sibcall_insn` and `sibcall_value_insn`
patterns and we find it preferable to implement the functionality in the
same way for every pattern.

There is one particular case which is slightly tricky.  The
implementation of TARGET_ASM_TRAMPOLINE_TEMPLATE uses a BR which needs
to be mitigated against.  The trampoline template is used *once* per
compilation unit, and the TRAMPOLINE_SIZE is exposed to the user via the
builtin macro __LIBGCC_TRAMPOLINE_SIZE__.
In the future we may implement function specific attributes to turn on
and off hardening on a per-function basis.
The fixed nature of the trampoline described above implies it will be
safer to ensure this speculation barrier is always used.

Testing:
  Bootstrap and regtest done on aarch64-none-linux
  Used a temporary hack(1) to use these options on every test in the
  testsuite and a script to check that the output never emitted an
  unmitigated RET or BR.


1) Temporary hack was a change to the testsuite to always use
`-save-temps` and run a script on the assembly output of those
compilations which produced one to ensure every RET or BR is immediately
followed by a speculation barrier.


gcc/ChangeLog:

2020-06-08  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_sls_barrier): New.
* config/aarch64/aarch64.c (aarch64_output_casesi): Emit
speculation barrier after BR instruction if needs be.
(aarch64_sls_barrier): New.
(aarch64_asm_trampoline_template): Add needed barriers.
* config/aarch64/aarch64.h (AARCH64_ISA_SB): New.
(TARGET_SB): New.
(TRAMPOLINE_SIZE): Account for barrier.
* config/aarch64/aarch64.md (indirect_jump, *casesi_dispatch,
*do_return, simple_return, *sibcall_insn, *sibcall_value_insn):
Emit barrier if needs be, also account for possible barrier in
"length" attribute.
* config/aarch64/aarch64.opt (-mharden-sls-retbr): Introduce new
option.

gcc/testsuite/ChangeLog:

2020-06-08  Matthew Malcomson  

* gcc.target/aarch64/sls-mitigation/sls-miti-retbr.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-retbr-pacret.c:
New test.
* gcc.target/aarch64/sls-mitigation/sls-mitigation.exp: New file.
* lib/target-supports.exp (check_effective_target_aarch64_asm_sb_ok):
New proc.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
8ca67d7e69edaf73c84f079e7e1c483009ad10c0..d2eb739bc89ecd9d0212416b8dc3ee4ba236a271
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,6 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+const char * aarch64_sls_barrier (int);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
24767c747bab0d711627c5c646937c42f210d70b..5da3d94e335fc315e1d90e6a674f2f09cf1a4529
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -281,6 +281,7 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_ISA_F32MM (aarch64_isa_flags & AARCH64_FL_F32MM)
 #define AARCH64_ISA_F64MM (aarch64_isa_flags & AARCH64_FL_F64MM)
 #define AARCH64_ISA_BF16  (aarch64_isa_flags & AARCH64_FL_BF16)
+#define AARCH64_ISA_SB(aarch64_isa_flags & AARCH64_FL_SB)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (TARGET_SIMD && AARCH64_ISA_CRYPTO)
@@ -378,6 +379,9 @@ extern unsigned aarch64_architecture_version;
 #define TARGET_FIX_ERR_A53_835769_DEFAULT 1
 #endif
 
+/* SB instruction is enabled through +sb.  */
+#define TARGET_SB (AARCH64_ISA_SB)
+
 /* Apply the workaround for Cortex-A53 erratum 835769.  */
 #define TARGET_FIX_ERR_A53_835769  \
   ((aarch64_fix_a53_err835769 == 2)\
@@ -1058,8 +1062,11 @@ typedef struct
 
 #define RETURN_ADDR_RTX aarch64_return_addr
 
-/* BTI c + 3 insns + 2 pointer-sized entries.  */
-#define TRAMPOLINE_SIZE(TARGET_ILP32 ? 24 :

[Patch 3/3] aarch64: Mitigate SLS for BLR instruction

This patch introduces the mitigation for Straight Line Speculation past
the BLR instruction.

This mitigation replaces BLR instructions with a BL to a stub which
simply consists of a BR to the original register.  These function stubs
are then appended with a speculation barrier to ensure no straight line
speculation happens after these jumps.

When optimising for speed we use a set of stubs for each function since
this should help the branch predictor make more accurate predictions
about where a stub should branch.

When optimising for size we use one set of stubs for the entire
compilation unit.
This set of stubs can have human readable names, and we are currently
using `__call_indirect_x` for register x.

As an example when optimising for size:
a
BLR x0
instruction would get transformed to
BL __call_indirect_x0
with __call_indirect_x0 labelling a thunk that contains
__call_indirect_x0:
BR X0
speculation barrier

Since we add these function stubs to the assembly output all in one
chunk, we need not add the speculation barrier directly after each one.
This is because we know for certain that the instructions directly after
the BR in all but the last function stub will be from another one of
these stubs and hence will not contain a speculation gadget.
Instead we add a speculation barrier at the end of the sequence of
stubs.

Special care needs to be given to this transformation occuring in
a context where BTI is enabled.  A BLR can jump to a `BTI c` target,
while a BR can only jump to a `BTI c` target if it uses the registers
x16 or x17.
Hence we use constraints to limit the registers used when this
transformation is being made in an environment that uses BTI.

This mitigation does not apply for BLR instructions in the following
places:
- Some accesses to thread-local variables use a code sequence with a BLR
  instruction.  This code sequence is part of the binary interface between
  compiler and linker. If this BLR instruction needs to be mitigated, it'd
  probably be best to do so in the linker. It seems that the code sequence
  for thread-local variable access is unlikely to lead to a Spectre Revalation
  Gadget.
- PLT stubs are produced by the linker and each contain a BLR instruction.
  It seems that at most only after the last PLT stub a Spectre Revalation
  Gadget might appear.

Testing:
  Bootstrap and regtest on AArch64
(with BOOT_CFLAGS="-mharden-sls=retbr,blr")
  Used a temporary hack(1) in gcc-dg.exp to use these options on every
  test in the testsuite, a slight modification to emit the speculation
  barrier after every function stub, and a script to check that the
  output never emitted a BLR, or unmitigated BR or RET instruction.
  Similar on an aarch64-none-elf cross-compiler.

1) Temporary hack emitted a speculation barrier at the end of every stub
function, and used a script to ensure that:
  a) Every RET or BR is immediately followed by a speculation barrier.
  b) No BLR instruction is emitted by compiler.


gcc/ChangeLog:

2020-06-08  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm):
New declaration.
* config/aarch64/aarch64.c (aarch64_use_return_insn_p): Return
false if hardening BLR instructions.
(aarch64_sls_shared_thunks): Global array to store stub labels.
(aarch64_create_blr_label): New.
(print_asm_branch): New macro.
(aarch64_sls_emit_blr_function_thunks): New.
(aarch64_sls_emit_shared_blr_thunks): New.
(aarch64_asm_file_end): New.
(aarch64_indirect_call_asm): New.
(TARGET_ASM_FILE_END): Use aarch64_asm_file_end.
(TARGET_ASM_FUNCTION_EPILOGUE): Use
aarch64_sls_emit_blr_function_thunks.
* config/aarch64/aarch64.h (struct machine_function): Introduce
`call_via` array to store function-local stub labels.
* config/aarch64/aarch64.md (*call_insn, *call_value_insn): Use
aarch64_indirect_call_asm to emit code when hardening BLR
instructions.

gcc/testsuite/ChangeLog:

2020-06-08  Matthew Malcomson  

* gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c: New test.
* gcc.target/aarch64/sls-mitigation/sls-miti-blr.c: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
d2eb739bc89ecd9d0212416b8dc3ee4ba236a271..e79f9cbc783e75132e999395ff975f9768436419
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -781,6 +781,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
 const char * aarch64_sls_barrier (int);
+const char * aarch64_indirect_call_asm (rtx);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/conf

[Patch 1/3] aarch64: New Straight Line Speculation (SLS) mitigation flags

Here we introduce the flags that will be used for straight line speculation.

The new flag introduced is `-mharden-sls=`.
This flag can take arguments of `none`, `all`, or a comma seperated list of one
or more of `retbr` or `blr`.
`none` indicates no special mitigation of the straight line speculation
vulnerability.
`all` requests all mitigations currently implemented.
`retbr` requests that the RET and BR instructions have a speculation barrier
inserted after them.
`blr` requests that BLR instructions are replaced by a BL to a function stub
using a BR with a speculation barrier after it.

Setting this on a per-function basis using attributes or the like is not
enabled, but may be in the future.

gcc/ChangeLog:

2020-06-08  Matthew Malcomson  

* config/aarch64/aarch64-protos.h (aarch64_harden_sls_retbr_p):
New.
(aarch64_harden_sls_blr_p): New.
* config/aarch64/aarch64.c (enum aarch64_sls_hardening_type):
New.
(aarch64_harden_sls_retbr_p): New.
(aarch64_harden_sls_blr_p): New.
(aarch64_validate_sls_mitigation): New.
(aarch64_override_options): Parse options for SLS mitigation.
* config/aarch64/aarch64.opt (-mharden-sls): New option.
* doc/invoke.texi: Document new option.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
9e43adb7db0373df6cc5ef1d2b22f217aca2aad2..8ca67d7e69edaf73c84f079e7e1c483009ad10c0
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -780,4 +780,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
+extern bool aarch64_harden_sls_retbr_p (void);
+extern bool aarch64_harden_sls_blr_p (void);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
e92c7e69fcb7a8689a8b7098b86ff050dc9ab78b..775f49991e5f599a843d3ef490b8cd044acfe78f
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14466,6 +14466,81 @@ aarch64_validate_mcpu (const char *str, const struct 
processor **res,
   return false;
 }
 
+
+/* Straight line speculation indicators.  */
+enum aarch64_sls_hardening_type
+{
+SLS_NONE = 0,
+SLS_RETBR = 1,
+SLS_BLR = 2,
+SLS_ALL = 3,
+};
+static enum aarch64_sls_hardening_type aarch64_sls_hardening;
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_retbr_p (void)
+{
+  return aarch64_sls_hardening & SLS_RETBR;
+}
+/* Return whether we should mitigatate Straight Line Speculation for the RET
+   and BR instructions.  */
+bool
+aarch64_harden_sls_blr_p (void)
+{
+  return aarch64_sls_hardening & SLS_BLR;
+}
+
+/* As of yet we only allow setting these options globally, in the future we may
+   allow setting them per function.  */
+static void
+aarch64_validate_sls_mitigation (const char *const_str)
+{
+  char *str_root = xstrdup (const_str);
+  char *token_save = NULL;
+  char *str = NULL;
+  int temp = SLS_NONE;
+
+  aarch64_sls_hardening = SLS_NONE;
+  if (strcmp (str_root, "none") == 0)
+goto finish;
+  if (strcmp (str_root, "all") == 0)
+{
+  aarch64_sls_hardening = SLS_ALL;
+  goto finish;
+}
+
+  str = strtok_r (str_root, ",", _save);
+  if (!str)
+{
+  error ("invalid argument given to %<-mharden-sls=%>");
+  goto finish;
+}
+
+  while (str)
+{
+  if (strcmp (str, "blr") == 0)
+   temp |= SLS_BLR;
+  else if (strcmp (str, "retbr") == 0)
+   temp |= SLS_RETBR;
+  else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
+   {
+ error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ break;
+   }
+  else
+   {
+ error ("invalid argument %<%s%> for %<-mharden-sls=%>", str);
+ break;
+   }
+  str = strtok_r (NULL, ",", _save);
+}
+  aarch64_sls_hardening = (aarch64_sls_hardening_type) temp;
+finish:
+  free (str_root);
+  return;
+}
+
 /* Parses CONST_STR for branch protection features specified in
aarch64_branch_protect_types, and set any global variables required.  
Returns
the parsing result and assigns LAST_STR to the last processed token from
@@ -14710,6 +14785,9 @@ aarch64_override_options (void)
   selected_arch = NULL;
   selected_tune = NULL;
 
+  if (aarch64_harden_sls_string)
+  aarch64_validate_sls_mitigation (aarch64_harden_sls_string);
+
   if (aarch64_branch_protection_string)
 aarch64_validate_mbranch_protection (aarch64_branch_protection_string);
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 
d99d14c137d8774d

Straight Line Speculation (SLS) mitigation.

Hi,

A new speculative cache side-channel vulnerability has been published at
the link below, named "straight-line speculation" (SLS in this patch series).
https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/downloads/straight-line-speculation
This vulnerability has been given CVE number CVE-2020-13844.

We have prepared some toolchain mitigations for this vulnerability. These
mitigate against the RET, BR case and the BLR case mentioned in the linked
whitepaper.

The part of vulnerability relevant to these toolchain mitigations is as
follows:
Some processors may speculatively execute the instructions immediately
following what should be a change in control flow. The examples we mitigate in
this patch series are the instructions RET (return), BR (indirect jump) and BLR
(indirect function call).
Where the speculative path contains a suitable code sequence, often described
by researchers as a "Spectre Revelation Gadget", such straight-line speculation
could lead to changes in the caches and similar structures that are indicative
of secrets, making those secrets vulnerable to revelation through timing
analysis.

The gist of the mitigation posted here is:

Every RET and BR instruction has a speculation barrier placed directly after
it. These speculation barriers are not to be architecturally executed, so the
performance cost is expected to be low.

Each BLR instruction is replaced by a BL to a function stub consisting of a BR
instruction followed by a speculation barrier.
This alternate approach is used since the instructions directly after a BLR are
usually architecturally executed, and this approach ensures the speculation
barrier is off that architecturally executed path.
Arm has been unable to demonstrate straight line speculation past a BL or B
instruction, and so we believe the BL instruction can be used without a
barrier.

In summary, a
RET
will be transformed to
RET

While a
BLR x
will be transformed to a
BL __call_indirect_x
call, with __call_indirect_x being a thunk that looks like
__call_indirect_x:
BR x

The patch series is structured as follows:
1) Introduce new command line arguments.
2) The RET/BR mitigation.
3) The BLR mitigation.

There are a few known places where this toolchain mitigation does not protect
against speculation places:
- Some accesses to thread-local variables use a code sequence including a BLR
instruction. This code sequence is part of the binary interface between
compiler and linker. If this BLR instruction needs to be mitigated, it'd
probably be best to do so in the linker.
It seems that the code sequence for thread-local variable access is unlikely
to lead to a Spectre Revalation Gadget.
- PLT stubs are produced by the linker, and each contain a BLR instruction.
It seems that at most this could introduce one spectre relevation gadget
after the last PLT stub.
- Use of BR, RET, or BLR instructions in assembly are not mitigated.
- Use of BR, RET, or BLR instructions in libraries and run-time library
routines that are not recompiled with this toolchain mitigation are not
mitigated.

N.b. patches with similar functionality are being posted to LLVM.

Thanks,
Matthew.

Entire patch series attached to cover letter.

all-patches.tar.gz
Description: application/gzip

[Arm] Account for C++17 artificial field determining Homogeneous Aggregates

2020-04-27 Thread Matthew Malcomson

NOTE:
  There is another patch in the making to handle the APCS abi (selected with
  `-mabi=apcs-gnu`).  That patch has a very different change in behaviour and
  is in a different part of the code so I'm keeping it in a different patch.

In C++14, an empty class deriving from an empty base is not an
aggregate, while in C++17 it is.  In order to implement this, GCC adds
an artificial field to such classes.

This artificial field has no mapping to Fundamental Data Types in the
Arm PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"7.1.2 Procedure Calling" in the arm PCS
https://developer.arm.com/docs/ihi0042/latest?_ga=2.60211309.1506853196.1533541889-405231439.1528186050

This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).

Before this change, the test below would pass the arguments to `f` in
general registers.  After this change, the test passes the arguments to
`f` using the vector registers.

The new behaviour matches the behaviour of `armclang`, and also matches
the GCC behaviour when run with `-std=gnu++14`.

> gcc -std=gnu++17 -march=armv8-a+simd -mfloat-abi=hard test.cpp

``` test.cpp
struct base {};

struct pair : base
{
  float first;
  float second;
  pair (float f, float s) : first(f), second(s) {}
};

void f (pair);
int main()
{
  f({3.14, 666});
  return 1;
}
```

We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions.  Unfortunately this warning is not emitted twice for multiple
calls to the same function, but I feel this is not much of a problem and can be
fixed later if needs be.

(i.e. if `main` called `f` twice in a row we only emit a diagnostic for the
first).

Testing:
Bootstrapped and regression tested on arm-linux.
This change fixes the struct-layout-1 tests Jakub added
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544204.html
Regression tested on arm-none-eabi.

gcc/ChangeLog:

2020-04-27  Matthew Malcomson  
Jakub Jelinek  

PR target/94383
* config/arm/arm.c (aapcs_vfp_sub_candidate): Account for C++17 empty
base class artificial fields.
(aapcs_vfp_is_call_or_return_candidate): Warn when PCS ABI
decision is different after this fix.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
0151bda90d961ae1a001c61cd5e94d6ec67e3aea..0db444c3fdb2ba6fb7ebad410310fb5b1bc0e304
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6139,9 +6139,19 @@ aapcs_vfp_cum_init (CUMULATIVE_ARGS *pcum  
ATTRIBUTE_UNUSED,
If *MODEP is VOIDmode, then set it to the first valid floating point
type.  If a non-floating point type is found, or if a floating point
type that doesn't match a non-VOIDmode *MODEP is found, then return -1,
-   otherwise return the count in the sub-tree.  */
+   otherwise return the count in the sub-tree.
+
+   The AVOID_CXX17_EMPTY_BASE argument is to allow the caller to check whether
+   this function has changed its behavior after the fix for PR94384 -- this fix
+   is to avoid artificial fields in empty base classes.
+   When called with this argument as a NULL pointer this function does not
+   avoid the artificial fields -- this is useful to check whether the function
+   returns something different after the fix.
+   When called pointing at a value, this function avoids such artificial fields
+   and sets the value to TRUE when one of these fields has been set.  */
 static int
-aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
+aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep,
+bool *avoid_cxx17_empty_base)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -6213,7 +6223,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
-   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep);
+   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep,
+avoid_cxx17_empty_base);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -6251,7 +6262,18 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
if (TREE_CODE (field) != FIELD_DECL)
  continue;
 
-   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep);
+   /* Ignore C++17 empty base fields, while their type indicates
+  they do contain padding, they have zero size and thus don't
+  contain any padding.  */
+   if (cxx17_empty_base_field_p (field)
+   && a

[AArch64] (PR94383) Avoid C++17 empty base field checking for HVA/HFA

2020-04-23 Thread Matthew Malcomson

In C++17, an empty class deriving from an empty base is not an
aggregate, while in C++14 it is.  In order to implement this, GCC adds
an artificial field to such classes.

This artificial field has no mapping to Fundamental Data Types in the
AArch64 PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"6.4.2 Parameter Passing Rules" in the AArch64 PCS.
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#the-base-procedure-call-standard

This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).

Before this change, the test below would pass the arguments to `f` in
general registers.  After this change, the test passes the arguments to
`f` using the vector registers.

The new behaviour matches the behaviour of `armclang`, and also matches
the behaviour when run with `-std=gnu++14`.

> gcc -std=gnu++17 test.cpp

``` test.cpp
struct base {};

struct pair : base
{
  float first;
  float second;
  pair (float f, float s) : first(f), second(s) {}
};

void f (pair);
int main()
{
  f({3.14, 666});
  return 1;
}
```

We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions.  Unfortunately this warning is not emitted twice for multiple
calls to the same function, but I feel this is not much of a problem and can be
fixed later if needs be.

(i.e. if `main` called `f` twice in a row we only emit a diagnostic for the
first).

Testing:
Minimal testing done just to demonstrate new warning message and to check
tests related to this still pass.


gcc/ChangeLog:

2020-04-23  Matthew Malcomson  
Jakub Jelinek  

PR target/94383
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Account for C++17
empty base class artificial fields.
(aarch64_vfp_is_call_or_return_candidate): Warn when ABI PCS decision is
different after this fix.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
f728ac530a4ac968de16ea59d2b724ed63b23d6f..25f60ec44ae3a177d0aceb4bae92ea17da91ba05
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -16442,9 +16442,19 @@ aarch64_member_type_forces_blk (const_tree 
field_or_array, machine_mode mode)
If *MODEP is VOIDmode, then set it to the first valid floating point
type.  If a non-floating point type is found, or if a floating point
type that doesn't match a non-VOIDmode *MODEP is found, then return -1,
-   otherwise return the count in the sub-tree.  */
+   otherwise return the count in the sub-tree.
+
+   The AVOID_CXX17_EMPTY_BASE argument is to allow the caller to check whether
+   this function has changed its behavior after the fix for PR94384 -- this fix
+   is to avoid artificial fields in empty base classes.
+   When called with this argument as a NULL pointer this function does not
+   avoid the artificial fields -- this is useful to check whether the function
+   returns something different after the fix.
+   When called pointing at a value, this function avoids such artificial fields
+   and sets the value to TRUE when one of these fields has been set.  */
 static int
-aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
+aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep,
+bool *avoid_cxx17_empty_base)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -16520,7 +16530,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
-   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep);
+   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep,
+avoid_cxx17_empty_base);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -16558,7 +16569,18 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
if (TREE_CODE (field) != FIELD_DECL)
  continue;
 
-   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep);
+   /* Ignore C++17 empty base fields, while their type indicates
+  they do contain padding, they have zero size and thus don't
+  contain any padding.  */
+   if (cxx17_empty_base_field_p (field)
+   && avoid_cxx17_empty_base)
+ {
+   *avoid_cxx17_empty_base = true;
+   continue;
+ }
+
+   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep,
+avoid_cxx17_empty_base);
if (sub_count < 0)
  retu

[AArch64] (PR94383) Avoid C++17 empty base field checking for HVA/HFA

2020-04-21 Thread Matthew Malcomson

In C++17, an empty class deriving from an empty base is not an
aggregate, while in C++14 it is.  In order to implement this, GCC adds
an artificial field to such classes.

This artificial field has no mapping to Fundamental Data Types in the
AArch64 PCS ABI and hence should not count towards determining whether an
object can be passed using the vector registers as per section
"6.4.2 Parameter Passing Rules" in the AArch64 PCS.
https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#the-base-procedure-call-standard

This patch avoids counting this artificial field in
aapcs_vfp_sub_candidate, and hence calculates whether such objects
should be passed in vector registers in the same manner as C++14 (where
the artificial field does not exist).

Before this change, the test below would pass the arguments to `f` in
general registers.  After this change, the test passes the arguments to
`f` using the vector registers.

The new behaviour matches the behaviour of `armclang`, and also matches
the behaviour when run with `-std=gnu++14`.

> gcc -std=gnu++17 test.cpp

``` test.cpp
struct base {};

struct pair : base
{
  float first;
  float second;
  pair (float f, float s) : first(f), second(s) {}
};

void f (pair);
int main()
{
  f({3.14, 666});
  return 1;
}
```

We add a `-Wpsabi` warning to catch cases where this fix has changed the ABI for
some functions.  Unfortunately this warning is currently emitted multiple times
for each problem, but I feel this is not much of a problem and can be fixed
later if needs be.


Testing:
Bootstrapped on aarch64-linux.
Jakub has provided a testsuite change that catches the original problem.
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544204.html
This patch makes that testcase pass.

gcc/ChangeLog:

2020-04-21  Matthew Malcomson  
Jakub Jelinek  

PR target/94383
* config/aarch64/aarch64.c (enum cpp17empty_state): New.
(aapcs_vfp_sub_candidate): Account for C++17 empty base class
artificial fields.
(aarch64_vfp_is_call_or_return_candidate): Warn when ABI PCS decision is
different after this fix.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
c90de65de127992cc0bd9603a32684ebb5bd4fdf..f9f40d52c846afae0d8d2cd9ed4737cdc4b88896
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15909,13 +15909,30 @@ aarch64_conditional_register_usage (void)
 }
 }
 
+enum cpp17empty_state {
+DONT_AVOID = 0,
+AVOID = 1,
+AVOID_DONE = 3
+};
+
 /* Walk down the type tree of TYPE counting consecutive base elements.
If *MODEP is VOIDmode, then set it to the first valid floating point
type.  If a non-floating point type is found, or if a floating point
type that doesn't match a non-VOIDmode *MODEP is found, then return -1,
-   otherwise return the count in the sub-tree.  */
+   otherwise return the count in the sub-tree.
+
+   The AVOID_C17EMPTY_FIELD argument is to allow the caller to check whether
+   this function has changed its behaviour after the fix for PR94384 -- this
+   fix is to avoid artificial fields in empty base classes.
+   When called pointing at a value of DONT_AVOID then this function does not
+   avoid the artificial fields -- this is useful to check whether the function
+   returns something different after the fix.
+   When called pointing at a value which has the AVOID bit set, this function
+   avoids such artificial fields and sets the value to AVOID_DONE when one of
+   these fields has been set.  */
 static int
-aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
+aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep,
+enum cpp17empty_state *avoid_c17empty_field)
 {
   machine_mode mode;
   HOST_WIDE_INT size;
@@ -15992,7 +16009,8 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
|| TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST)
  return -1;
 
-   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep);
+   count = aapcs_vfp_sub_candidate (TREE_TYPE (type), modep,
+avoid_c17empty_field);
if (count == -1
|| !index
|| !TYPE_MAX_VALUE (index)
@@ -16030,7 +16048,22 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode 
*modep)
if (TREE_CODE (field) != FIELD_DECL)
  continue;
 
-   sub_count = aapcs_vfp_sub_candidate (TREE_TYPE (field), modep);
+   /* Ignore C++17 empty base fields, while their type indicates
+  they do contain padding, they have zero size and thus don't
+  contain any padding.  */
+   if (DECL_ARTIFICIAL (field)
+   && DECL_NAME (field) == NULL_TREE
+   && RECORD_OR_UNION_TYPE_P (TREE_TYPE (field))
+

Re: [PATCH 1/2] testsuite: [arm/cde] Include arm_cde.h and arm_mve.h in arm_v8m_main_cde

2020-04-20 Thread Matthew Malcomson


On 20/04/2020 08:47, Christophe Lyon via Gcc-patches wrote:

Since arm_cde.h includes stdint.h, its use requires the presence of
the right gnu/stub-*.h, so make sure to include it when checking the
arm_v8*m_main_cde* effective targets, otherwise we can decide CDE is
supported while it's not really (all tests that use arm_v8m_main_cde*
also include arm_cde.h aynway).

Similarly for the effective targets that also require MVE.

This makes several tests unsupported rather than fail.


Hi Christophe,

This looks like a good idea to me -- though I'm not a maintainer so that 
means little ;-)


I just wanted to ask -- is this needed mainly if testing without a fully 
functional C99 environment?
I would imagine that a fully set up environment would have some sort of 
`stdint.h` available, even if not using the glibc gnu/stub-*.h files 
(e.g. using the newlib headers).


Cheers,
Matthew


---
  gcc/testsuite/lib/target-supports.exp | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index d16498d..23a5abf 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5140,21 +5140,25 @@ proc add_options_for_arm_v8_2a_bf16_neon { flags } {
  #   /* { dg-add-options arm_v8m_main_cde } */
  # The tests are valid for Arm.
  
-foreach { armfunc armflag armdef } {

+foreach { armfunc armflag armdef arminc } {
arm_v8m_main_cde
"-march=armv8-m.main+cdecp0+cdecp6 -mthumb"
"defined (__ARM_FEATURE_CDE)"
+   ""
arm_v8m_main_cde_fp
"-march=armv8-m.main+fp+cdecp0+cdecp6 -mthumb -mfpu=auto"
"defined (__ARM_FEATURE_CDE) && defined (__ARM_FP)"
+   ""
arm_v8_1m_main_cde_mve
"-march=armv8.1-m.main+mve+cdecp0+cdecp6 -mthumb -mfpu=auto"
"defined (__ARM_FEATURE_CDE) && defined (__ARM_FEATURE_MVE)"
+   "#include "
arm_v8_1m_main_cde_mve_fp
"-march=armv8.1-m.main+mve.fp+cdecp0+cdecp6 -mthumb -mfpu=auto"
"defined (__ARM_FEATURE_CDE) || __ARM_FEATURE_MVE == 3"
+   "#include "
} {
-eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef ] {
+eval [string map [list FUNC $armfunc FLAG $armflag DEF $armdef INC $arminc 
] {
proc check_effective_target_FUNC_ok_nocache { } {
global et_FUNC_flags
set et_FUNC_flags ""
@@ -5167,6 +5171,8 @@ foreach { armfunc armflag armdef } {
#if !(DEF)
#error "DEF failed"
#endif
+   #include 
+   INC
} "FLAG"] } {
set et_FUNC_flags "FLAG"
return 1

[Arm] Disallow arm_movdi when targetting MVE

2020-04-15 Thread Matthew Malcomson

Without disabling this, the pattern can try and move DImode values
between floating point registers and general registers.
The constraints on this pattern can't handle that, and reload goes into
an infinite loop.

This was the cause of a testsuite failure in cde_v_1_mve.c, which is now
gone.

A DImode move for MVE now uses the `movdi_vfp` pattern, which is the same
pattern used for such a move when MVE is not available but the target has
TARGET_HARD_FLOAT.

Testing done:
Bootstrapped and regtested on arm-none-linux-gnueabihf
regtested on arm-none-eabi

gcc/ChangeLog:

2020-04-15  Matthew Malcomson  

* config/arm/arm.md (arm_movdi): Disallow for MVE.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
7bc55cce61b2e45e5875a233dd4546d59399d749..a6a31f8f4ef8d3c8c96c61b450dbd7c28d08b55c
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6233,6 +6233,7 @@
(match_operand:DI 1 "di_operand"  "rDa,Db,Dc,mi,r"))]
   "TARGET_32BIT
&& !(TARGET_HARD_FLOAT)
+   && !(TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT)
&& !TARGET_IWMMXT
&& (   register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))"

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
7bc55cce61b2e45e5875a233dd4546d59399d749..a6a31f8f4ef8d3c8c96c61b450dbd7c28d08b55c
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6233,6 +6233,7 @@
(match_operand:DI 1 "di_operand"  "rDa,Db,Dc,mi,r"))]
   "TARGET_32BIT
&& !(TARGET_HARD_FLOAT)
+   && !(TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT)
&& !TARGET_IWMMXT
&& (   register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))"

[Arm] Allow the use of arm_cde.h for C++

2020-04-09 Thread Matthew Malcomson

arm_cde.h includes the arm_mve_types.h header, which declares some C++
overloaded functions.

There is a superfluous `extern "C"` statement in arm_cde.h, which
encompasses these functions.  This means that if compiling for C++, the
overloaded functions are declared, but are declared without name
mangling.  Hence all the function names are the same and we have many
conflicting declarations.

Testing Done:
  Regression tested for arm-none-eabi.

gcc/ChangeLog:

2020-04-09  Matthew Malcomson  

* config/arm/arm_cde.h: Remove `extern "C"` when compiling for
C++.

gcc/testsuite/ChangeLog:

2020-04-09  Matthew Malcomson  

* g++.target/arm/cde_mve.C: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm_cde.h b/gcc/config/arm/arm_cde.h
index 
d8ddda6bd648d8b94e97f7b6403b7708cebc9eb3..0ba3ee02d057dd280c9b3400ef70b46ed472aec3
 100644
--- a/gcc/config/arm/arm_cde.h
+++ b/gcc/config/arm/arm_cde.h
@@ -27,10 +27,6 @@
 #ifndef _GCC_ARM_CDE_H
 #define _GCC_ARM_CDE_H 1
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #include 
 
 #if defined (__ARM_FEATURE_CDE)
@@ -177,8 +173,4 @@ extern "C" {
 
 #endif
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/gcc/testsuite/g++.target/arm/cde_mve.C 
b/gcc/testsuite/g++.target/arm/cde_mve.C
new file mode 100644
index 
..897cbd2b8119471385c1c51158b1aa4851acbaf1
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/cde_mve.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
+
+/* Ensure this compiles.  */
+#include "arm_cde.h"
+int foo ()
+{
+  return 1;
+}

diff --git a/gcc/config/arm/arm_cde.h b/gcc/config/arm/arm_cde.h
index 
d8ddda6bd648d8b94e97f7b6403b7708cebc9eb3..0ba3ee02d057dd280c9b3400ef70b46ed472aec3
 100644
--- a/gcc/config/arm/arm_cde.h
+++ b/gcc/config/arm/arm_cde.h
@@ -27,10 +27,6 @@
 #ifndef _GCC_ARM_CDE_H
 #define _GCC_ARM_CDE_H 1
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #include 
 
 #if defined (__ARM_FEATURE_CDE)
@@ -177,8 +173,4 @@ extern "C" {
 
 #endif
 
-#ifdef __cplusplus
-}
-#endif
-
 #endif
diff --git a/gcc/testsuite/g++.target/arm/cde_mve.C 
b/gcc/testsuite/g++.target/arm/cde_mve.C
new file mode 100644
index 
..897cbd2b8119471385c1c51158b1aa4851acbaf1
--- /dev/null
+++ b/gcc/testsuite/g++.target/arm/cde_mve.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_main_cde_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_main_cde_mve_fp } */
+
+/* Ensure this compiles.  */
+#include "arm_cde.h"
+int foo ()
+{
+  return 1;
+}

[wwwdocs] [arm] Document CDE intrinsics from ACLE

New in GCC 10.


### Attachment also inlined for ease of reply###


diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 
3d8e0ba989e860d307310378a2be99b32a27261f..389561d13c69b650528e8ed8859ebbb5760438c6
 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -624,6 +624,9 @@ a work-in-progress.
   added: this M-profile feature is no longer restricted to targets
   with MOVT. For example, -mcpu=cortex-m0
   now supports this option.
+  Support for the Custom Datapath Extension ACLE
+  https://developer.arm.com/docs/101028/0010/custom-datapath-extension;>
+  intrinsics has been added.
 
 
 AVR

diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 
3d8e0ba989e860d307310378a2be99b32a27261f..389561d13c69b650528e8ed8859ebbb5760438c6
 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -624,6 +624,9 @@ a work-in-progress.
   added: this M-profile feature is no longer restricted to targets
   with MOVT. For example, -mcpu=cortex-m0
   now supports this option.
+  Support for the Custom Datapath Extension ACLE
+  https://developer.arm.com/docs/101028/0010/custom-datapath-extension;>
+  intrinsics has been added.
 
 
 AVR

[PATCH] [Arm] Implement scalar Custom Datapath Extension intrinsics

Suggestions applied.  I didn't update the indentation since it seems good to me
(with the `(unspec:SIDI ...)` indented one more level than the SET as
suggested).  I'm guessing it looked strange since that line is indented with a
tab and that can look different in different text editors.


This patch introduces the scalar CDE (Custom Datapath Extension)
intrinsics for the arm backend.

There is nothing beyond the standard in this patch.  We simply build upon what
has been done by Dennis for the vector intrinsics.

We do add `+cdecp6` to the default arguments for `target-supports.exp`, this
allows for using coprocessor 6 in tests. This patch uses an alternate
coprocessor to ease assembler scanning by looking for a use of coprocessor 6.

We also ensure that any DImode registers are put in an even-odd register pair
when compiling for a target with CDE -- this avoids faulty code generation for
-Os when producing the cx*d instructions.

Testing done:
Bootstrapped and regtested for arm-none-linux-gnueabihf.

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config/arm/arm.c (arm_hard_regno_mode_ok): DImode registers forced
into even-odd register pairs for TARGET_CDE.
* config/arm/arm.h (ARM_CCDE_CONST_1): New.
(ARM_CCDE_CONST_2): New.
(ARM_CCDE_CONST_3): New.
* config/arm/arm.md (arm_cx1si, arm_cx1di arm_cx1asi, arm_cx1adi
arm_cx2si, arm_cx2di arm_cx2asi, arm_cx2adi arm_cx3si, arm_cx3di
arm_cx3asi, arm_cx3adi): New patterns.
* config/arm/arm_cde.h (__arm_cx1, __arm_cx1a, __arm_cx2, __arm_cx2a,
__arm_cx3, __arm_cx3a, __arm_cx1d, __arm_cx1da, __arm_cx2d, __arm_cx2da,
__arm_cx3d, __arm_cx3da): New ACLE function macros.
* config/arm/arm_cde_builtins.def (cx1, cx1a, cx2, cx2a, cx3, cx3a):
Define intrinsics.
* config/arm/iterators.md (cde_suffix, cde_dest): New mode attributes.
* config/arm/predicates.md (const_int_ccde1_operand,
const_int_ccde2_operand, const_int_ccde3_operand): New.
* config/arm/unspecs.md (UNSPEC_CDE, UNSPEC_CDEA): New.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: New test.
* gcc.target/arm/acle/cde.c: New test.
* lib/target-supports.exp: Update CDE flags to enable coprocessor 6.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
ca36a74cd1fa161c388961588fa0f96030b7888e..83886a2fcb3844f6a5060e451125a6cd2d505c5c
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -576,6 +576,9 @@ extern int arm_arch_cde;
 extern int arm_arch_cde_coproc;
 extern const int arm_arch_cde_coproc_bits[];
 #define ARM_CDE_CONST_COPROC   7
+#define ARM_CCDE_CONST_1   ((1 << 13) - 1)
+#define ARM_CCDE_CONST_2   ((1 << 9 ) - 1)
+#define ARM_CCDE_CONST_3   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_1   ((1 << 11) - 1)
 #define ARM_VCDE_CONST_2   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_3   ((1 << 3 ) - 1)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
da0bfbc35501ba40324a38ee9ebc194f43196837..be076e4ac59be7f224b769bbca4013a554b50c07
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25057,10 +25057,11 @@ arm_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (ARM_NUM_REGS (mode) > 4)
return false;
 
-  if (TARGET_THUMB2 && !TARGET_HAVE_MVE)
+  if (TARGET_THUMB2 && !(TARGET_HAVE_MVE || TARGET_CDE))
return true;
 
-  return !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
+  return !((TARGET_LDRD || TARGET_CDE)
+  && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
 }
 
   if (regno == FRAME_POINTER_REGNUM
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
6d5560398dae3d0ace0342b4907542d2a6865f70..9c4d66f4efe70d9ab8896865cbf45285e5cfbaf9
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4408,6 +4408,70 @@
(set_attr "shift" "3")
(set_attr "type" "logic_shift_reg")])
 


+;; Custom Datapath Extension insns.
+(define_insn "arm_cx1"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SI 2 "const_int_ccde1_operand" "i")]
+   UNSPEC_CDE))]
+   "TARGET_CDE"
+   "cx1\\tp%c1, , %2"
+)
+
+(define_insn "arm_cx1a"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SIDI 2 "s_register_operand" "0")
+

[PATCH] [Arm] Implement scalar Custom Datapath Extension intrinsics

Patch rebased onto recent trunk.


This patch introduces the scalar CDE (Custom Datapath Extension)
intrinsics for the arm backend.

There is nothing beyond the standard in this patch.  We simply build upon what
has been done by Dennis for the vector intrinsics.

We do add `+cdecp6` to the default arguments for `target-supports.exp`, this
allows for using coprocessor 6 in tests. This patch uses an alternate
coprocessor to ease assembler scanning by looking for a use of coprocessor 6.

We also ensure that any DImode registers are put in an even-odd register pair
when compiling for a target with CDE -- this avoids faulty code generation for 
-Os
when producing the cx*d instructions.

Testing done:
Bootstrapped and regtested for arm-none-linux-gnueabihf.

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config/arm/arm.c (arm_hard_regno_mode_ok): DImode registers forced 
into
even-odd register pairs for TARGET_CDE.
* config/arm/arm.h (ARM_CCDE_CONST_1): New.
(ARM_CCDE_CONST_2): New.
(ARM_CCDE_CONST_3): New.
* config/arm/arm.md (arm_cx1si, arm_cx1di arm_cx1asi, arm_cx1adi 
arm_cx2si,
arm_cx2di arm_cx2asi, arm_cx2adi arm_cx3si, arm_cx3di arm_cx3asi,
arm_cx3adi): New patterns.
* config/arm/arm_cde.h (__arm_cx1, __arm_cx1a, __arm_cx2, __arm_cx2a,
__arm_cx3, __arm_cx3a, __arm_cx1d, __arm_cx1da, __arm_cx2d, __arm_cx2da,
__arm_cx3d, __arm_cx3da): New ACLE function macros.
* config/arm/arm_cde_builtins.def (cx1, cx1a, cx2, cx2a, cx3, cx3a): 
Define
intrinsics.
* config/arm/iterators.md (cde_suffix, cde_dest): New mode attributes.
* config/arm/predicates.md (const_int_ccde1_operand,
const_int_ccde2_operand, const_int_ccde3_operand): New.
* config/arm/unspecs.md (UNSPEC_CDE, UNSPEC_CDEA): New.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: New test.
* gcc.target/arm/acle/cde.c: New test.
* lib/target-supports.exp: Update CDE flags to enable coprocessor 6.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
ca36a74cd1fa161c388961588fa0f96030b7888e..83886a2fcb3844f6a5060e451125a6cd2d505c5c
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -576,6 +576,9 @@ extern int arm_arch_cde;
 extern int arm_arch_cde_coproc;
 extern const int arm_arch_cde_coproc_bits[];
 #define ARM_CDE_CONST_COPROC   7
+#define ARM_CCDE_CONST_1   ((1 << 13) - 1)
+#define ARM_CCDE_CONST_2   ((1 << 9 ) - 1)
+#define ARM_CCDE_CONST_3   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_1   ((1 << 11) - 1)
 #define ARM_VCDE_CONST_2   ((1 << 6 ) - 1)
 #define ARM_VCDE_CONST_3   ((1 << 3 ) - 1)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
da0bfbc35501ba40324a38ee9ebc194f43196837..be076e4ac59be7f224b769bbca4013a554b50c07
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25057,10 +25057,11 @@ arm_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (ARM_NUM_REGS (mode) > 4)
return false;
 
-  if (TARGET_THUMB2 && !TARGET_HAVE_MVE)
+  if (TARGET_THUMB2 && !(TARGET_HAVE_MVE || TARGET_CDE))
return true;
 
-  return !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
+  return !((TARGET_LDRD || TARGET_CDE)
+  && GET_MODE_SIZE (mode) > 4 && (regno & 1) != 0);
 }
 
   if (regno == FRAME_POINTER_REGNUM
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
6d5560398dae3d0ace0342b4907542d2a6865f70..9c4d66f4efe70d9ab8896865cbf45285e5cfbaf9
 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4408,6 +4408,70 @@
(set_attr "shift" "3")
(set_attr "type" "logic_shift_reg")])
 


+;; Custom Datapath Extension insns.
+(define_insn "arm_cx1"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SI 2 "const_int_ccde1_operand" "i")]
+   UNSPEC_CDE))]
+   "TARGET_CDE"
+   "cx1\\tp%c1, , %2"
+)
+
+(define_insn "arm_cx1a"
+   [(set (match_operand:SIDI 0 "s_register_operand" "=r")
+(unspec:SIDI [(match_operand:SI 1 "const_int_coproc_operand" "i")
+  (match_operand:SIDI 2 "s_register_operand" "0")
+  (match_operand:SI 3 "const_int_ccde1_operand" "i")]
+   UNSPEC_CDEA))]
+   "TARGET_CDE"
+   "cx1a\\tp%c1, , %3"
+)
+
+(define_insn "arm_cx2"
+   [(set (match_operand:SIDI 0 &quo

[Arm] Implement CDE predicated intrinsics for MVE registers

This is an update of the previous patch but rebased onto recent MVE patches.
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543414.html

These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543527.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
Bootstrap and full regtest on arm-none-linux-gnueabihf
Full regtest on arm-none-eabi

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
(CX_BINARY_UNONE_QUALIFIERS): New.
(CX_TERNARY_UNONE_QUALIFIERS): New.
(arm_resolve_overloaded_builtin): Move to arm-c.c.
(arm_expand_builtin_args): Update error message.
(enum resolver_ident): New.
(arm_describe_resolver): New.
(arm_cde_end_args): New.
* config/arm/arm-builtins.h: New file.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
(arm_resolve_cde_builtin): Moved from arm-builtins.c.
* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
New.
* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
* config/arm/iterators.md (CDE_VCX): New int iterator.
(a) New int attribute.
* config/arm/mve.md (arm_vcx1q_p_v16qi, arm_vcx2q_p_v16qi,
arm_vcx3q_p_v16qi): New patterns.
* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
forms.
* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
* gcc.target/arm/acle/cde_v_1_err.c (test_imm_range): Update for
error message format change

[Arm] Implement CDE intrinsics for MVE registers.


This patch updates the previous one by rebasing onto the recent MVE patches on
trunk.

Implement CDE intrinsics on MVE registers.

Other than the basics required for adding intrinsics this patch consists
of three changes.

** We separate out the MVE types and casts from the arm_mve.h header.

This is so that the types can be used in arm_cde.h without the need to include
the entire arm_mve.h header.
The only type that arm_cde.h needs is `uint8x16_t`, so this separation could be
avoided by using a `typedef` in this file.
Since the introduced intrinsics are all defined to act on the full range of MVE
types, declaring all such types seems intuitive since it will provide their
declaration to the user too.

This arm_mve_types.h header not only includes the MVE types, but also
the conversion intrinsics between them.
Some of the conversion intrinsics are needed for arm_cde.h, but most are
not.  We include all conversion intrinsics to keep the definition of
such conversion functions all in one place, on the understanding that
extra conversion functions being defined when including `arm_cde.h` is
not a problem.

** We define the TARGET_RESOLVE_OVERLOADED_BUILTIN hook for the Arm backend.

This is needed to implement the polymorphism for the required intrinsics.
The intrinsics have no specialised version, and the resulting assembly
instruction for all different types should be exactly the same.
Due to this we have implemented these intrinsics via one builtin on one type.
All other calls to the intrinsic with different types are implicitly cast to
the one type that is defined, and hence are all expanded to the same RTL
pattern that is only defined for one machine mode.

** We seperate the initialisation of the CDE intrinsics from others.

This allows us to ensure that the CDE intrinsics acting on MVE registers
are only created when both CDE and MVE are available.
Only initialising these builtins when both features are available is
especially important since they require a type that is only initialised
when the target supports hard float.  Hence trying to initialise these
builtins on a soft float target would cause an ICE.

Testing done:
  Full bootstrap and regtest on arm-none-linux-gnueabihf
  Regression test on arm-none-eabi


NOTE: This patch depends on the CDE intrinsic framework patch by Dennis.
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542008.html


Ok for trunk?

gcc/ChangeLog:

2020-04-08  Matthew Malcomson  

* config.gcc (arm_mve_types.h): New extra_header for arm.
* config/arm/arm-builtins.c (arm_resolve_overloaded_builtin): New.
(arm_init_cde_builtins): New.
(arm_init_acle_builtins): Remove initialisation of CDE builtins.
(arm_init_builtins): Call arm_init_cde_builtins when target
supports CDE.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New declaration.
(arm_register_target_pragmas): Initialise resolve_overloaded_builtin
hook to the implementation for the arm backend.
* config/arm/arm.h (ARM_MVE_CDE_CONST_1): New.
(ARM_MVE_CDE_CONST_2): New.
(ARM_MVE_CDE_CONST_3): New.
* config/arm/arm_cde.h (__arm_vcx1q_u8): New.
(__arm_vcx1qa): New.
(__arm_vcx2q): New.
(__arm_vcx2q_u8): New.
(__arm_vcx2qa): New.
(__arm_vcx3q): New.
(__arm_vcx3q_u8): New.
(__arm_vcx3qa): New.
* config/arm/arm_cde_builtins.def (vcx1q, vcx1qa, vcx2q, vcx2qa, vcx3q,
vcx3qa): New builtins defined.
* config/arm/arm_mve.h: Move typedefs and conversion intrinsics
to arm_mve_types.h header.
* config/arm/arm_mve_types.h: New file.
* config/arm/mve.md (arm_vcx1qv16qi, arm_vcx1qav16qi, arm_vcx2qv16qi,
arm_vcx2qav16qi, arm_vcx3qv16qi, arm_vcx3qav16qi): New patterns.
* config/arm/predicates.md (const_int_mve_cde1_operand,
const_int_mve_cde2_operand, const_int_mve_cde3_operand): New.

gcc/testsuite/ChangeLog:

2020-04-08  Matthew Malcomson  
Dennis Zhang  

* gcc.target/arm/acle/cde-mve-error-1.c: New test.
* gcc.target/arm/acle/cde-mve-error-2.c: New test.
* gcc.target/arm/acle/cde-mve-error-3.c: New test.
* gcc.target/arm/acle/cde-mve-full-assembly.c: New test.
* gcc.target/arm/acle/cde-mve-tests.c: New test.
* lib/target-supports.exp (arm_v8_1m_main_cde_mve_fp): New check
effective.


cde-mve-intrinsics.patch.gz
Description: application/gzip

[Arm] Implement CDE predicated intrinsics for MVE registers

2020-04-07 Thread Matthew Malcomson

These intrinsics are the predicated version of the intrinsics inroduced
in https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542725.html.

These are not yet public on developer.arm.com but we have reached
internal consensus on them.

The approach follows the same method as for the CDE intrinsics for MVE
registers, most notably using the same arm_resolve_overloaded_builtin
function with minor modifications.

The resolver hook has been moved from arm-builtins.c to arm-c.c so it
can access the c-common function build_function_call_vec.  This function
is needed to perform the same checks on arguments as a normal C or C++
function would perform.
It is fine to put this resolver in arm-c.c since it's only use is for
the ACLE functions, and these are only available in C/C++.
So that the resolver function has access to information it needs from
the builtins, we put two query functions into arm-builtins.c and use
them from arm-c.c.

We rely on the order that the builtins are defined in
gcc/config/arm/arm_cde_builtins.def, knowing that the predicated
versions come after the non-predicated versions.

The machine description patterns for these builtins are simpler than
those for the non-predicated versions, since the accumulator versions
*and* non-accumulator versions both need an input vector now.
The input vector is needed for the non-accumulator version to describe
the original values for those lanes that are not updated during the
merge operation.

We additionally need to introduce qualifiers for these new builtins,
which follow the same pattern as the non-predicated versions but with an
extra argument to describe the predicate.

Error message changes:
- We directly mention the builtin argument when complaining that an
  argument is not in the correct range.
  This more closely matches the C error messages.
- We ensure the resolver complains about *all* invalid arguments to a
  function instead of just the first one.
- The resolver error messages index arguments from 1 instead of 0 to
  match the arguments coming from the C/C++ frontend.

In order to allow the user to give an argument for the merging predicate
when they don't care what data is stored in the 'false' lanes, we also
move the __arm_vuninitializedq* intrinsics from arm_mve.h to
arm_mve_types.h which is shared with arm_cde.h.

We only move the fully type-specified `__arm_vuninitializedq*`
intrinsics and not the polymorphic versions, since moving the
polymorphic versions requires moving the _Generic framework as well as
just the intrinsics we're interested in.  This matches the approach taken
for the `__arm_vreinterpret*` functions in this include file.

This patch also contains a slight change in spacing of an existing
assembly instruction to be emitted.
This is just to help writing tests -- vmsr usually has a tab and a space
between the mnemonic and the first argument, but in one case it just has
a tab -- making all the same helps make test regexps simpler.

Testing Done:
Bootstrap and full regtest on arm-none-linux-gnueabihf
Full regtest on arm-none-eabi

All testing done with a local fix for the bugzilla PR below.
That bugzilla currently causes multiple ICE's on the tests added in
this patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94341

gcc/ChangeLog:

2020-04-07  Matthew Malcomson  

* config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): New.
(CX_BINARY_UNONE_QUALIFIERS): New.
(CX_TERNARY_UNONE_QUALIFIERS): New.
(arm_resolve_overloaded_builtin): Move to arm-c.c.
(arm_expand_builtin_args): Update error message.
(enum resolver_ident): New.
(arm_describe_resolver): New.
(arm_cde_end_args): New.
* config/arm/arm-builtins.h: New file.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New.
(arm_resolve_cde_builtin): Moved from arm-builtins.c.
* config/arm/arm_cde.h (__arm_vcx1q_m, __arm_vcx1qa_m,
__arm_vcx2q_m, __arm_vcx2qa_m, __arm_vcx3q_m, __arm_vcx3qa_m):
New.
* config/arm/arm_cde_builtins.def (vcx1q_p_, vcx1qa_p_,
vcx2q_p_, vcx2qa_p_, vcx3q_p_, vcx3qa_p_): New builtin defs.
* config/arm/iterators.md (CDE_VCX): New int iterator.
(a) New int attribute.
* config/arm/mve.md (arm_vcx1q_p_v16qi, arm_vcx2q_p_v16qi,
arm_vcx3q_p_v16qi): New patterns.
* config/arm/vfp.md (thumb2_movhi_fp16): Extra space in assembly.

gcc/testsuite/ChangeLog:

2020-04-07  Matthew Malcomson  

* gcc.target/arm/acle/cde-errors.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-1.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-2.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-error-3.c: Add predicated forms.
* gcc.target/arm/acle/cde-mve-full-assembly.c: Add predicated
forms.
* gcc.target/arm/acle/cde-mve-tests.c: Add predicated forms.
* gcc.target/arm/acle/cde_v_1_err.c

[Arm] Implement CDE intrinsics for MVE registers.

2020-03-26 Thread Matthew Malcomson

[Arm] Implement CDE intrinsics for MVE registers.

Implement CDE intrinsics on MVE registers.

Other than the basics required for adding intrinsics this patch consists
of three changes.

** We separate out the MVE types and casts from the arm_mve.h header.

This is so that the types can be used in arm_cde.h without the need to include
the entire arm_mve.h header.
The only type that arm_cde.h needs is `uint8x16_t`, so this separation could be
avoided by using a `typedef` in this file.
Since the introduced intrinsics are all defined to act on the full range of MVE
types, declaring all such types seems intuitive since it will provide their
declaration to the user too.

This arm_mve_types.h header not only includes the MVE types, but also
the conversion intrinsics between them.
Some of the conversion intrinsics are needed for arm_cde.h, but most are
not.  We include all conversion intrinsics to keep the definition of
such conversion functions all in one place, on the understanding that
extra conversion functions being defined when including `arm_cde.h` is
not a problem.

** We define the TARGET_RESOLVE_OVERLOADED_BUILTIN hook for the Arm backend.

This is needed to implement the polymorphism for the required intrinsics.
The intrinsics have no specialised version, and the resulting assembly
instruction for all different types should be exactly the same.
Due to this we have implemented these intrinsics via one builtin on one type.
All other calls to the intrinsic with different types are implicitly cast to
the one type that is defined, and hence are all expanded to the same RTL
pattern that is only defined for one machine mode.

** We seperate the initialisation of the CDE intrinsics from others.

This allows us to ensure that the CDE intrinsics acting on MVE registers
are only created when both CDE and MVE are available.
Only initialising these builtins when both features are available is
especially important since they require a type that is only initialised
when the target supports hard float.  Hence trying to initialise these
builtins on a soft float target would cause an ICE.

Testing done:
  Full bootstrap and regtest on arm-none-linux-gnueabihf
  Regression test on arm-none-eabi

NOTE: These tests ICE on arm-none-eabi.
This is due to bugzilla bug 94341, and not due to a problem in this patch.

NOTE2: This patch depends on the CDE intrinsic framework patch by
Dennis.
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542008.html


Ok for trunk?

gcc/ChangeLog:

2020-03-26  Matthew Malcomson  

* config.gcc (arm_mve_types.h): New extra_header for arm.
* config/arm/arm-builtins.c (arm_resolve_overloaded_builtin): New.
(arm_init_cde_builtins): New.
(arm_init_acle_builtins): Remove initialisation of CDE builtins.
(arm_init_builtins): Call arm_init_cde_builtins when target
supports CDE.
* config/arm/arm-c.c (arm_resolve_overloaded_builtin): New declaration.
(arm_register_target_pragmas): Initialise resolve_overloaded_builtin
hook to the implementation for the arm backend.
* config/arm/arm.h (ARM_MVE_CDE_CONST_1): New.
(ARM_MVE_CDE_CONST_2): New.
(ARM_MVE_CDE_CONST_3): New.
* config/arm/arm_cde.h (__arm_vcx1q_u8): New.
(__arm_vcx1qa): New.
(__arm_vcx2q): New.
(__arm_vcx2q_u8): New.
(__arm_vcx2qa): New.
(__arm_vcx3q): New.
(__arm_vcx3q_u8): New.
(__arm_vcx3qa): New.
* config/arm/arm_cde_builtins.def (vcx1q, vcx1qa, vcx2q, vcx2qa, vcx3q,
vcx3qa): New builtins defined.
* config/arm/arm_mve.h: Move typedefs and conversion intrinsics
to arm_mve_types.h header.
* config/arm/arm_mve_types.h: New file.
* config/arm/mve.md (arm_vcx1qv16qi, arm_vcx1qav16qi, arm_vcx2qv16qi,
arm_vcx2qav16qi, arm_vcx3qv16qi, arm_vcx3qav16qi): New patterns.
* config/arm/predicates.md (const_int_mve_cde1_operand,
const_int_mve_cde2_operand, const_int_mve_cde3_operand): New.

gcc/testsuite/ChangeLog:

2020-03-26  Matthew Malcomson  
Dennis Zhang  

* gcc.target/arm/acle/cde-mve-error-1.c: New test.
* gcc.target/arm/acle/cde-mve-error-2.c: New test.
* gcc.target/arm/acle/cde-mve-error-3.c: New test.
* gcc.target/arm/acle/cde-mve-full-assembly.c: New test.
* gcc.target/arm/acle/cde-mve-tests.c: New test.
* lib/target-supports.exp (arm_v8_1m_main_cde_mve_fp): New check
effective.


cde_mve_regs.patch.gz
Description: application/gzip

Fwd: [testsuite] Add @ lines to check-function-bodies fluff

2020-03-10 Thread Matthew Malcomson


Cc'ing maintainers and original author of `check-function-bodies`.

It looks like I missed that the first time around.


 Forwarded Message 
Subject: [testsuite] Add @ lines to check-function-bodies fluff
Date: Tue, 10 Mar 2020 17:22:52 +
From: Matthew Malcomson 
To: gcc-patches@gcc.gnu.org
CC: n...@arm.com

When using `check-function-bodies`, the subroutine 
`parse_function_bodies` uses

the `fluff` regexp to remove uninteresting assembly lines.

Arm targets generate assembly with some lines prefixed by `@`, these 
lines are

left by this process.

As an example of some lines prefixed by `@': the assembly output from the
`stacktest1` function in "bfloat16_simd_3_1.c" is:

.align  2
.global stacktest1
.arch armv8.2-a
.syntax unified
.arm
.fpu neon-fp-armv8
.type   stacktest1, %function
stacktest1:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #8
add r3, sp, #6
vst1.16 {d0[0]}, [r3]
vld1.16 {d0[0]}, [r3]
add sp, sp, #8
@ sp needed
bx  lr
.size   stacktest1, .-stacktest1



It seems that previous uses of `check-function-bodies` in the arm 
backend have
avoided problems with such lines since they use the `...` regexp in each 
place

such fluff occurs.

I'm currently writing a patch that I'd like to match the entire function 
body,

so I'd like to remove such `@` lines automatically.

gcc/testsuite/ChangeLog:

2020-03-10  Matthew Malcomson  

* lib/scanasm.exp (parse_function_bodies): Lines starting with '@' also
counted as fluff.



### Attachment also inlined for ease of reply 
###



diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765 
100644

--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
  # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
  set fd [open $filename r]
 set in_function 0


diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765
 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0

[testsuite] Add @ lines to check-function-bodies fluff

2020-03-10 Thread Matthew Malcomson

When using `check-function-bodies`, the subroutine `parse_function_bodies` uses
the `fluff` regexp to remove uninteresting assembly lines.

Arm targets generate assembly with some lines prefixed by `@`, these lines are
left by this process.

As an example of some lines prefixed by `@': the assembly output from the
`stacktest1` function in "bfloat16_simd_3_1.c" is:

.align  2
.global stacktest1
.arch armv8.2-a
.syntax unified
.arm
.fpu neon-fp-armv8
.type   stacktest1, %function
stacktest1:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #8
add r3, sp, #6
vst1.16 {d0[0]}, [r3]
vld1.16 {d0[0]}, [r3]
add sp, sp, #8
@ sp needed
bx  lr
.size   stacktest1, .-stacktest1



It seems that previous uses of `check-function-bodies` in the arm backend have
avoided problems with such lines since they use the `...` regexp in each place
such fluff occurs.

I'm currently writing a patch that I'd like to match the entire function body,
so I'd like to remove such `@` lines automatically.

gcc/testsuite/ChangeLog:

2020-03-10  Matthew Malcomson  

* lib/scanasm.exp (parse_function_bodies): Lines starting with '@' also
counted as fluff.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765
 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0

diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index 
5ca58d4042027683da12bc2a1d161195cd6439e7..f7d27735112f8edd8a39a326020c3d08dd36e765
 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -569,7 +569,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//)}
+set fluff {^\s*(?:\.|//|@)}
 
 set fd [open $filename r]
 set in_function 0

[AArch64] effective_target for aarch64 f64mm asm

2020-01-21 Thread Matthew Malcomson

Commit 9ceec73 introduced intrinsics for the AArch64 FP64 matrix
multiply instructions.  These require binutils support for the same
instructions.
( See https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01234.html for the
testsuite failures this introduced. )

This patch adds a DejaGNU test to ensure this binutils support is there
and uses it in the files that need this test.

NOTE:
I tried to find some way to run the assembly tests if the given version
of binutils is available, but run the compile tests if not.  This is
pretty awkward -- It seems I either have to duplicate all the DejaGNU
comments between two files, or write filename exceptions into the
list of files that aarch64-sve-acle-asm.exp runs tests for.

I decided to not do either, since I figure not running the tests on
older binutils isn't too bad compared to having a bunch more DejaGNU
stuff making the tests harder to read.

Testing Done:
Checked on a cross-compiler that:
Tests running for binutils commit e264b5b7a are listed as UNSUPPORTED.
Tests running for binutils commit 26916852e all pass.

gcc/testsuite/ChangeLog:

2020-01-21  Matthew Malcomson  

* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Use require
directive.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise.
* lib/target-supports.exp: Add assembly requirement directive.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
index 
7badc75a43ab2009e9406afc04c980fc01834716..6eb94f1ca5fda961bb23f5c4cf66bc5694d26f36
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
index 
dd8a1c53cd0fb7b7acd0b92394f3977382ac26e0..0a77c37ddd5978db765b4bad23f24da82a0aed11
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
index 
30563698310f65060d34be4bef4c57a74ef9d734..65c6d9b02b804a1480cd5014c8f0ca5f534bacbe
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
index 
d4702fa6cc15e9f93751d8579cfecfd37759306e..e3dc9bd51cf93c2f97e4277181e686d0bf53a1ea
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
@@ -1,5 +1,6 @@
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 /* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok }  */
 
 #include "test_sve_acle.h"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
index 
4604b0b5fbfb716ae814bf88f7acfe8bf0eaa9f5..f3af8e5cc25791a56930618abf5c092d51e94e9b
 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c
@@ -1,5 +1,6 @@
 /

Re: [Patch] [AArch64] [SVE] Implement svld1ro intrinsic.

2020-01-20 Thread Matthew Malcomson

On 20/01/2020 14:53, Christophe Lyon wrote:
> On Thu, 9 Jan 2020 at 16:53, Matthew Malcomson
>  wrote:
>>
>> +   (match_test "aarch64_sve_ld1ro_operand_p (op, DImode)")))
>> +
> 
> 
>>   (define_predicate "aarch64_sve_ldff1_operand"
>> (and (match_code "mem")
> 
>>  (match_test "aarch64_sve_ldff1_operand_p (op)")))
> 
> 
> Hi,
> 
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
>> new file mode 100644
>> index 
>> ..7badc75a43ab2009e9406afc04c980fc01834716
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
>> @@ -0,0 +1,119 @@
>> +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
>> +/* { dg-additional-options "-march=armv8.6-a+sve+f64mm" } */
> 
> What is the binutils version requirement for this?
> Some validations using binutils-2.33.1 exhibit failures like:
> /xgcc 
> -B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/gcc3/gcc/
> -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never -fdiagnostics-urls=never -std=c90 -O0 -g
> -DTEST_FULL -march=armv8.2-a+sve -fno-ipa-icf
> -march=armv8.6-a+sve+f64mm -c -o ld1ro_s16.o
> /gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c
> Assembler messages:
> Error: unknown architecture `armv8.6-a+sve+f64mm'
> 
> Error: unrecognized option -march=armv8.6-a+sve+f64mm
> compiler exited with status 1
> FAIL: gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c  -std=c90 -O0 -g
> -DTEST_FULL  1 blank line(s) in output
> FAIL: gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c  -std=c90 -O0 -g
> -DTEST_FULL (test for excess errors)
> Excess errors:
> Assembler messages:
> Error: unknown architecture `armv8.6-a+sve+f64mm'
> Error: unrecognized option -march=armv8.6-a+sve+f64mm
> 
> 
> while other configurations using 2.32 binutils seem to pass this test:
> /xgcc 
> -B/home/tcwg-buildslave/workspace/tcwg-buildfarm__0/_build/builds/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/gcc.git~master_rev_3684bbb022cd75da55e1457673f269980aa12cdf-stage2/gcc/
> /home/tcwg-buildslave/workspace/tcwg-buildfarm__0/snapshots/gcc.git~master_rev_3684bbb022cd75da55e1457673f269980aa12cdf/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c
> -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never -fdiagnostics-urls=never -std=c90 -O0 -g
> -DTEST_FULL -march=armv8.2-a+sve -fno-ipa-icf
> -march=armv8.6-a+sve+f64mm -S -o ld1ro_f16.s
> PASS: gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c  -std=c90 -O0 -g
> -DTEST_FULL (test for excess errors)
> 
> Ha... took me a while to realize that in the latter case we stop after
> generating the .s file and do not call the assembler...
> 
> 
> So... do we want/need additional consistency checks between gcc and
> gas versions?
> 

Ah! Yes, that should certainly be done.
I'll start that now.

Thanks for pointing it out -- it seems I really did not take enough care 
with this patch...

MM

> 
> Thanks,
> 
> Christophe
> 
> 
>> +

[PATCH][AArch64] Enable CLI for Armv8.6-A f64mm

2020-01-13 Thread Matthew Malcomson

This patch is necessary for sve-ld1ro intrinsic I posted in
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00466.html .

I had mistakenly thought this option was already enabled upstream.

This provides the option +f64mm, that turns on the 64 bit floating point
matrix multiply extension.  This extension is only available for
AArch64.  Turning on this extension also turns on the SVE extension.

This extension is optional and only available at Armv8.2-A and onward.

We also add the ACLE defined macro for this extension.

Tested on aarch64 cross compiler from x86.

gcc/ChangeLog:

2020-01-13  Matthew Malcomson  

* config/aarch64/aarch64-c.c (_ARM_FEATURE_MATMUL_FLOAT64):
Introduce this ACLE specified predefined macro.
* config/aarch64/aarch64-option-extensions.def (f64mm): New.
(fp): Disabling this disables f64mm.
(simd): Disabling this disables f64mm.
(fp16): Disabling this disables f64mm.
(sve): Disabling this disables f64mm.
* config/aarch64/aarch64.h (AARCH64_FL_F64MM): New.
(AARCH64_ISA_F64MM): New.
(TARGET_F64MM): New.
* doc/invoke.texi (f64mm): Document new option.

gcc/testsuite/ChangeLog:

2020-01-13  Matthew Malcomson  

* gcc.target/aarch64/pragma_cpp_predefs_2.c: Check for f64mm
predef.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 
9ccca422b045fcd6875f192680220b2af9d59d31..22fc4a5e337acb4cdd96618b57736f55892bd0e7
 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -166,6 +166,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (TARGET_MEMTAG, "__ARM_FEATURE_MEMORY_TAGGING", pfile);
 
   aarch64_def_or_undef (TARGET_I8MM, "__ARM_FEATURE_MATMUL_INT8", pfile);
+  aarch64_def_or_undef (TARGET_F64MM, "__ARM_FEATURE_MATMUL_FP64", pfile);
   aarch64_def_or_undef (TARGET_BF16_SIMD,
"__ARM_FEATURE_BF16_VECTOR_ARITHMETIC", pfile);
   aarch64_def_or_undef (TARGET_BF16_FP,
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
b/gcc/config/aarch64/aarch64-option-extensions.def
index 
5022a1b3552f35364e35b3955bf2c39a33ab0752..6057b33033a0f3a8be7d656dc1e459d4d93b2842
 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -53,26 +53,26 @@
 /* Enabling "fp" just enables "fp".
Disabling "fp" also disables "simd", "crypto", "fp16", "aes", "sha2",
"sha3", sm3/sm4, "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4",
-   "sve2-bitperm", "i8mm" and "bf16".  */
+   "sve2-bitperm", "i8mm", "f64mm", and "bf16".  */
 AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | \
  AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | \
  AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | AARCH64_FL_SM4 | \
  AARCH64_FL_SVE | AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | \
  AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | \
  AARCH64_FL_SVE2_BITPERM | AARCH64_FL_I8MM | \
- AARCH64_FL_BF16, false, "fp")
+ AARCH64_FL_F64MM | AARCH64_FL_BF16, false, "fp")
 
 /* Enabling "simd" also enables "fp".
Disabling "simd" also disables "crypto", "dotprod", "aes", "sha2", "sha3",
"sm3/sm4", "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4",
-   "sve2-bitperm", and "i8mm".  */
+   "sve2-bitperm", "i8mm", and "f64mm".  */
 AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, \
  AARCH64_FL_CRYPTO | AARCH64_FL_DOTPROD | \
  AARCH64_FL_AES | AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | \
  AARCH64_FL_SM4 | AARCH64_FL_SVE | AARCH64_FL_SVE2 | \
  AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | \
  AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM | \
- AARCH64_FL_I8MM, false, \
+ AARCH64_FL_I8MM | AARCH64_FL_F64MM, false, \
  "asimd")
 
 /* Enabling "crypto" also enables "fp", "simd", "aes" and "sha2".
@@ -92,12 +92,13 @@ AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, 0, 0, false, 
"crc32")
 AARCH64_OPT_EXTENSION("lse", AARCH64_FL_LSE, 0, 0, false, "atomics")
 
 /* Enabling "fp16" also enables &q

Re: [PATCH 6/X] [libsanitizer] Add hwasan pass and associated gimple changes

2020-01-13 Thread Matthew Malcomson

> On 12/12/19 4:19 PM, Matthew Malcomson wrote:
>> - if (is_store && !param_asan_instrument_writes)
>> + if (is_store
>> + && (!param_asan_instrument_writes || !param_hwasan_instrument_writes))
>> return;
>> - if (!is_store && !param_asan_instrument_reads)
>> + if (!is_store
>> + && (!param_asan_instrument_reads || !param_hwasan_instrument_reads))
>> return;
> 
> I know it's very unlikely, but one can use -fsanitize=address and
> --param hwasan-instrument-reads=0 which will drop instrumentation of reads
> for ASAN.

Ah! Thanks for the catch.
Updated patch is attached and has been tested.

I've also attached the patch including the new `Optimization` keyword to the
hwasan parameters to this email -- (putting both on this email to avoid a bit of
email spam).

> 
> Similarly for other parameters.
> 
> Martin


Inlining the new bit that avoids the problem you pointed out above, since
the implementation of that is the only new part someone might object to.


###

diff --git a/gcc/asan.c b/gcc/asan.c
index 
fe6841b4f084f75be534cc9653079ca0a5bdc94e..55723bf4d5d2a4111eb574d169f21332f6eb33ff
 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -326,6 +326,25 @@ asan_sanitize_allocas_p (void)
   return (asan_sanitize_stack_p () && param_asan_protect_allocas);
 }
 
+bool
+asan_instrument_reads (void)
+{
+  return (sanitize_flags_p (SANITIZE_ADDRESS) && param_asan_instrument_reads);
+}
+
+bool
+asan_instrument_writes (void)
+{
+  return (sanitize_flags_p (SANITIZE_ADDRESS) && param_asan_instrument_writes);
+}
+
+bool
+asan_memintrin (void)
+{
+  return (sanitize_flags_p (SANITIZE_ADDRESS) && param_asan_memintrin);
+}
+
+
 /* Checks whether section SEC should be sanitized.  */
 
 static bool
@@ -1382,6 +1673,28 @@ hwasan_sanitize_allocas_p (void)
   return (hwasan_sanitize_stack_p () && param_hwasan_protect_allocas);
 }
 
+/* Should we instrument reads?  */
+bool
+hwasan_instrument_reads (void)
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_reads);
+}
+
+/* Should we instrument writes?  */
+bool
+hwasan_instrument_writes (void)
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_writes);
+}
+
+/* Should we instrument builtin calls?  */
+bool
+hwasan_memintrin (void)
+{
+  return (hwasan_sanitize_p () && param_hwasan_memintrin);
+}
+
+
 /* Insert code to protect stack vars.  The prologue sequence should be emitted
directly, epilogue sequence returned.  BASE is the register holding the
stack base, against which OFFSETS array offsets are relative to, OFFSETS
@@ -2220,9 +2539,9 @@ static void
 instrument_derefs (gimple_stmt_iterator *iter, tree t,
   location_t location, bool is_store)
 {
-  if (is_store && !param_asan_instrument_writes)
+  if (is_store && !(asan_instrument_writes () || hwasan_instrument_writes ()))
 return;
-  if (!is_store && !param_asan_instrument_reads)
+  if (!is_store && !(asan_instrument_reads () || hwasan_instrument_reads ()))
 return;
 
   tree type, base;
@@ -2376,7 +2696,7 @@ instrument_mem_region_access (tree base, tree len,
 static bool
 instrument_builtin_call (gimple_stmt_iterator *iter)
 {
-  if (!param_asan_memintrin)
+  if (!(asan_memintrin () || hwasan_memintrin ()))
 return false;
 
   bool iter_advanced_p = false;

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 
77649e839c968c38485ea30d332576b9b089effe..c5746e3c89283ef0652de85cf6b94e76263aba38
 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
bool *);
 static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
  int, bool *);
+static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
+   int, bool *);
 static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
 int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
  handle_no_sanitize_address_attribute, NULL },
+  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
+ handle_no_sanitize_hwaddress_attribute, NULL },
   { "no_sanitize_thread", 0, 0, true, false, false, false,
  handle_no_sanitize_thread_attribute, NULL

Re: Document --with-build-config=bootstrap-asan option.

2020-01-13 Thread Matthew Malcomson

On 11/01/2020 07:19, Gerald Pfeifer wrote:
> On Thu, 12 Dec 2019, Matthew Malcomson wrote:
>> gcc/ChangeLog:
>>
>> 2019-12-12  Matthew Malcomson  
>>
>>  * doc/install.texi: Document bootstrap-asan configuration option.
> 
> I see this introduces a new table.
> 
>> +Some examples of build configurations designed for developers of GCC are:
> 
> @samp{bootstrap-time}, @samp{bootstrap-debug-ckovw} and others appear
> to fall into the same camp, essentially expected to be used by maintainers
> only.
> 
> Would it make sense to add your new option to the existing table, or
> perhaps see which other options from the existing table to move into
> your new one?  Thoughts?

Sounds good me.


> 
> 
> The patch is okay modulo the question above.
> 
> Thanks,
> Gerald
> 

Patch with above suggestion.

#

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
80b47812fe66a8ef50edf3aad9708ab3409ba7dc..0705759c69f64c6d06e91f7ae83bb8c1ad210f34
 
100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2668,6 +2668,10 @@ Arranges for the run time of each program started 
by the GCC driver,
  built in any stage, to be logged to @file{time.log}, in the top level of
  the build tree.

+@item @samp{bootstrap-asan}
+Compiles GCC itself using Address Sanitization in order to catch 
invalid memory
+accesses within the GCC code.
+
  @end table

  @section Building a cross compiler

[Patch] [AArch64] [SVE] Implement svld1ro intrinsic.

2020-01-09 Thread Matthew Malcomson

We take no action to ensure the SVE vector size is large enough.  It is
left to the user to check that before compiling this intrinsic or before
running such a program on a machine.

The main difference between ld1ro and ld1rq is in the allowed offsets,
the implementation difference is that ld1ro is implemented using integer
modes since there are no pre-existing vector modes of the relevant size.
Adding new vector modes simply for this intrinsic seems to make the code
less tidy.

Specifications can be found under the "Arm C Language Extensions for
Scalable Vector Extension" title at
https://developer.arm.com/architectures/system-architectures/software-standards/acle

gcc/ChangeLog:

2020-01-09  Matthew Malcomson  

* config/aarch64/aarch64-protos.h
(aarch64_sve_ld1ro_operand_p): New.
* config/aarch64/aarch64-sve-builtins-base.cc
(class load_replicate): New.
(class svld1ro_impl): New.
(class svld1rq_impl): Change to inherit from load_replicate.
(svld1ro): New sve intrinsic function base.
* config/aarch64/aarch64-sve-builtins-base.def (svld1ro):
New DEF_SVE_FUNCTION.
* config/aarch64/aarch64-sve-builtins-base.h
(svld1ro): New decl.
* config/aarch64/aarch64-sve-builtins.cc
(function_expander::add_mem_operand): Modify assert to allow
OImode.
* config/aarch64/aarch64-sve.md (@aarch64_sve_ld1ro): New
pattern.
* config/aarch64/aarch64.c
(aarch64_sve_ld1rq_operand_p): Implement in terms of ...
(aarch64_sve_ld1rq_ld1ro_operand_p): This.
(aarch64_sve_ld1ro_operand_p): New.
* config/aarch64/aarch64.md (UNSPEC_LD1RO): New unspec.
* config/aarch64/constraints.md (UOb,UOh,UOw,UOd): New.
* config/aarch64/predicates.md
(aarch64_sve_ld1ro_operand_{b,h,w,d}): New.

gcc/testsuite/ChangeLog:

2020-01-09  Matthew Malcomson  

* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: New test.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: New test.



### Attachment also inlined for ease of reply###


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
c16b9362ea986ff221755bfc4d10bae674a67ed4..6d2162b93932e433677dae48e5c58975be2902d2
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -582,6 +582,7 @@ rtx aarch64_simd_gen_const_vector_dup (machine_mode, 
HOST_WIDE_INT);
 bool aarch64_simd_mem_operand_p (rtx);
 bool aarch64_sve_ld1r_operand_p (rtx);
 bool aarch64_sve_ld1rq_operand_p (rtx);
+bool aarch64_sve_ld1ro_operand_p (rtx, scalar_mode);
 bool aarch64_sve_ldff1_operand_p (rtx);
 bool aarch64_sve_ldnf1_operand_p (rtx);
 bool aarch64_sve_ldr_operand_p (rtx);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 
38bd3adce1ebbde4c58531ffd26eedd4ae4938b0..e52a6012565fadd84cdd77a613f887e5ae53a576
 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -1139,7 +1139,7 @@ public:
   }
 };
 
-class svld1rq_impl : public function_base
+class load_replicate : public function_base
 {
 public:
   unsigned int
@@ -1153,7 +1153,11 @@ public:
   {
 return fi.scalar_type (0);
   }
+};
 
+class svld1rq_impl : public load_replicate
+{
+public:
   machine_mode
   memory_vector_mode (const function_instance ) const OVERRIDE
   {
@@ -1168,6 +1172,23 @@ public:
   }
 };
 
+class svld1ro_impl : public load_replicate
+{
+public:
+  machine_mode
+  memory_vector_mode (const function_instance ) const OVERRIDE
+  {
+return OImode;
+  }
+
+  rtx
+  expand (function_expander ) const OVERRIDE
+  {
+insn_code icode = code_for_aarch64_sve_ld1ro (e.vector_mode (0));
+return e.use_contiguous_load_insn (icode);
+  }
+};
+
 /* Implements svld2, svld3 and svld4.  */
 class svld234_impl : public full_width_access
 {
@@ -2571,6 +2592,7 @@ FUNCTION (svlasta, svlast_impl, (UNSPEC_LASTA))
 FUNCTION (svlastb, svlast_impl, (UNSPEC_LASTB))
 FUNCTION (svld1, svld1_impl,)
 FUNCTION (svld1_gather, svld1_gather_impl,)
+FUNCTION (svld1ro, svld1ro_impl,)
 FUNCTION (svld1rq, svld1rq_impl,)
 FUNCTION (svld1sb, svld1_extend_impl, (TYPE_SUFFIX_s8))
 FUNCTION (svld1sb_gather, svld1_gather_extend_impl, (TYPE_SUFFIX_s8))
diff --git a/gcc/conf

Re: [Patch 0/X] HWASAN v3

2020-01-08 Thread Matthew Malcomson

Hi everyone,

I'm writing this email to summarise & publicise the state of this patch 
series, especially the difficulties around approval for GCC 10 mentioned 
on IRC.

The main obstacle seems to be that no maintainer feels they have enough 
knowledge about hwasan and justification that it's worthwhile to approve 
the patch series.

Similarly, Martin has given a review of the parts of the code he can 
(thanks!), but doesn't feel he can do a deep review of the code related 
to the RTL hooks and stack expansion -- hence that part is as yet not 
reviewed in-depth.

The questions around justification raised on IRC are mainly that it 
seems like a proof-of-concept for MTE rather than a stand-alone useable 
sanitizer.  Especially since in the GNU world hwasan instrumented code 
is not really ready for production since we can only use the 
less-"interceptor ABI" rather than the "platform ABI".  This restriction 
is because there is no version of glibc with the required modifications 
to provide the "platform ABI".

(n.b. that since https://reviews.llvm.org/D69574 the code-generation for 
these ABI's is the same).

 From my perspective the reasons that make HWASAN useful in itself are:

1) Much less memory usage.

 From a back-of-the-envelope calculation based on the hwasan paper's 
table of memory overhead from over-alignment 
https://arxiv.org/pdf/1802.09517.pdf  I guess hwasan instrumented code 
has an overhead of about 1.1x (~4% from overalignment and ~6.25% from 
shadow memory), while asan seems to have an overhead somewhere in the 
range 1.5x - 3x.

Maybe there's some data out there comparing total overheads that I 
haven't found? (I'd appreciate a reference if anyone has that info).

2) Available on more architectures that MTE.

HWASAN only requires TBI, which is a feature of all AArch64 machines, 
while MTE will be an optional extension and only available on certain 
architectures.

3) This enables using hwasan in the kernel.

While instrumented user-space applications will be using the 
"interceptor ABI" and hence are likely not production-quality, the 
biggest aim of implementing hwasan in GCC is to allow building the Linux 
kernel with tag-based sanitization using GCC.

Instrumented kernel code uses hooks in the kernel itself, so this ABI 
distinction is no longer relevant, and this sanitizer should produce a 
production-quality kernel binary.

I'm hoping I can find a maintainer willing to review and ACK this patch 
series -- especially with stage3 coming to a close soon.  If there's 
anything else I could do to help get someone willing up-to-speed then 
please just ask.

Cheers,
Matthew

On 07/01/2020 15:14, Martin Liška wrote:
> On 12/12/19 4:18 PM, Matthew Malcomson wrote:
> 
> Hello.
> 
> I've just sent few comments that are related to the v3 of the patch set.
> Based on the HWASAN (limited) knowledge the patch seems reasonable to me.
> I haven't looked much at the newly introduced RTL-hooks.
> But these seems to me isolated to the aarch64 port.
> 
> I can also verify that the patchset works on my aarch64 linux machine and
> hwasan.exp and asan.exp tests succeed.
> 
>> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory 
>> tagging,
>> since I'm not sure the way I found to implement this would be 
>> acceptable.  The
>> inlined patch below works but it requires a special declaration 
>> instead of just
>> an ~#include~.
> 
> Knowing that, I would not bother with the printing of HWASAN_MARK.
> 
> Thanks for the series,
> Martin

[PING] Re: [Patch 0/X] HWASAN v3

2020-01-06 Thread Matthew Malcomson

Ping


On 17/12/2019 14:11, Matthew Malcomson wrote:
> I've noticed a few minor problems with this patch series after I sent it
> out (mostly testcase stuff, one documentation tidy-up, but also that one
> patch didn't bootstrap due to something fixed in a later patch).
> 
> I also rely on a documentation change that isn't part of the series.
> 
> I figure I should make this easy on anyone that wants to try the patch
> series out, so I'm attaching a compressed tarfile containing the entire
> patch series plus the additional documentation patch so it can all be
> applied at once with `git apply *`.
> 
> It's attached.
> 
> Matthew.
> 
> 
> 
> On 12/12/2019 15:18, Matthew Malcomson wrote:
>> Hello,
>>
>> I've gone through the suggestions Martin made and implemented  the ones I 
>> think
>> I can implement for GCC10.
>>
>> The two functionality changes in this version are:
>> Added the --param's hwasan-instrument-reads, hwasan-instrument-writes,
>> hwasan-instrument-allocas, hwasan-memintrin, options.  I.e. Those that asan 
>> has
>> and that make sense for hwasan.
>>
>> Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
>> deterministic tagging approach.
>>
>>
>> There are a lot of extra comments and tests.
>>
>>
>> Bootstrapped and regtested on x86_64 and AArch64.
>> Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and 
>> hwasan
>> features tested there.
>> Built the linux kernel using this feature and ran the test_kasan.ko testing 
>> to
>> check the this works for the kernel.
>> (NOTE: I actually did all the above testing before a search and replace of
>> `memory_tagging_p` for `hwasan_sanitize_p` and fixing a typo in the
>> `hwasan-instrument-allocas` parameter name, I will run all the tests again
>> before committing but figure I'll send this out now since I fully expect the
>> tests to still pass).
>>
>>
>> I noticed one extra testsuite failure from those mentioned in the previous
>> version emails: g++.dg/cpp2a/ucn2.C.
>> I believe this is HWASAN correctly catching a problem in the compiler.
>> I've logged the issue here 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 .
>>
>>
>> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
>> since I'm not sure the way I found to implement this would be acceptable.  
>> The
>> inlined patch below works but it requires a special declaration instead of 
>> just
>> an ~#include~.
>>
>>
>> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
>> index a1bc081..d81eb12 100644
>> --- a/gcc/internal-fn.h
>> +++ b/gcc/internal-fn.h
>> @@ -101,10 +101,16 @@ extern void init_internal_fns ();
>>
>>extern const char *const internal_fn_name_array[];
>>
>> +
>> +extern bool hwasan_sanitize_p (void);
>>static inline const char *
>>internal_fn_name (enum internal_fn fn)
>>{
>> -  return internal_fn_name_array[(int) fn];
>> +  const char *ret = internal_fn_name_array[(int) fn];
>> +  if (! strcmp (ret, "ASAN_MARK")
>> +  && hwasan_sanitize_p ())
>> +return "HWASAN_MARK";
>> +  return ret;
>>}
>>
>>extern internal_fn lookup_internal_fn (const char *);
>>
>>
>> Entire patch series attached to cover letter.
>>

Re: [Patch 0/X] HWASAN v3

2019-12-17 Thread Matthew Malcomson

I've noticed a few minor problems with this patch series after I sent it 
out (mostly testcase stuff, one documentation tidy-up, but also that one 
patch didn't bootstrap due to something fixed in a later patch).

I also rely on a documentation change that isn't part of the series.

I figure I should make this easy on anyone that wants to try the patch 
series out, so I'm attaching a compressed tarfile containing the entire 
patch series plus the additional documentation patch so it can all be 
applied at once with `git apply *`.

It's attached.

Matthew.



On 12/12/2019 15:18, Matthew Malcomson wrote:
> Hello,
> 
> I've gone through the suggestions Martin made and implemented  the ones I 
> think
> I can implement for GCC10.
> 
> The two functionality changes in this version are:
> Added the --param's hwasan-instrument-reads, hwasan-instrument-writes,
> hwasan-instrument-allocas, hwasan-memintrin, options.  I.e. Those that asan 
> has
> and that make sense for hwasan.
> 
> Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
> deterministic tagging approach.
> 
> 
> There are a lot of extra comments and tests.
> 
> 
> Bootstrapped and regtested on x86_64 and AArch64.
> Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and hwasan
> features tested there.
> Built the linux kernel using this feature and ran the test_kasan.ko testing to
> check the this works for the kernel.
> (NOTE: I actually did all the above testing before a search and replace of
> `memory_tagging_p` for `hwasan_sanitize_p` and fixing a typo in the
> `hwasan-instrument-allocas` parameter name, I will run all the tests again
> before committing but figure I'll send this out now since I fully expect the
> tests to still pass).
> 
> 
> I noticed one extra testsuite failure from those mentioned in the previous
> version emails: g++.dg/cpp2a/ucn2.C.
> I believe this is HWASAN correctly catching a problem in the compiler.
> I've logged the issue here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 
> .
> 
> 
> I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
> since I'm not sure the way I found to implement this would be acceptable.  The
> inlined patch below works but it requires a special declaration instead of 
> just
> an ~#include~.
> 
> 
> diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> index a1bc081..d81eb12 100644
> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -101,10 +101,16 @@ extern void init_internal_fns ();
>   
>   extern const char *const internal_fn_name_array[];
>   
> +
> +extern bool hwasan_sanitize_p (void);
>   static inline const char *
>   internal_fn_name (enum internal_fn fn)
>   {
> -  return internal_fn_name_array[(int) fn];
> +  const char *ret = internal_fn_name_array[(int) fn];
> +  if (! strcmp (ret, "ASAN_MARK")
> +  && hwasan_sanitize_p ())
> +return "HWASAN_MARK";
> +  return ret;
>   }
>   
>   extern internal_fn lookup_internal_fn (const char *);
> 
> 
> Entire patch series attached to cover letter.
> 



all-patches.tar.gz
Description: all-patches.tar.gz

Re: [PATCH 7/X] [libsanitizer] Add tests

2019-12-16 Thread Matthew Malcomson

I just remembered that the run tests all fail on cross builds.
This patch includes the effective target check to determine whether a hwasan
binary can be run on a given target.

We then add that target requirement to all tests which need to run.



Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

2019-12-16  Matthew Malcomson  

* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/param-memintrin.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New file.
* g++.dg/hwasan/rvo-handled.C: New test.
* gcc.dg/hwasan/hwasan.exp: New file.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New file.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c 
b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
new file mode 100644
index 
..d38b1f3f62d97dc3f5c3882137c370700fd4f9a0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-require-effective-target hwaddress_exec } */
+/* { dg-shouldfail "hwasan" } */
+/* This program fails at runtime in the libhwasan library.
+   The allocator can't handle the requested invalid alignment.  */
+
+int
+main ()
+{
+  void *p = __builtin_aligned_alloc (17, 100);
+  if (((unsigned long long)p & 0x10) == 0)
+return 0;
+  return 1;
+}
+
+/* { dg-output "HWAddressSanitizer: invalid alignment requested in 
aligned_alloc: 17" } */
diff --git a/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c 
b/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c
new file mode 100644
index 
..5e4c168f77efac5c9335768fb5b76fb3752783e9
--- /dev/null
+++ b/gcc/testsuite/c-c++-co

[PATCH 3/X] [libsanitizer] Add option to bootstrap using HWASAN

Updated to include documentation ChangeLog and apply cleanly on the
bootstrap-asan documentation I mentioned in a different email.



This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

2019-08-29  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

2019-12-12  Matthew Malcomson  

* bootstrap-hwasan.mk: New file.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* doc/install.texi: Document new option.

libiberty/ChangeLog:

2019-12-12  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
f6027397cedd5c28cd821a0bc7629d6537393ecf..0b502f2784927d72eab248741633305296bf09e7
 100755
--- a/configure
+++ b/configure
@@ -7270,7 +7270,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
50e2fa135b10486170f68ffa162dcbeee2adff90..f793e14e0316e26d05ba72696e8c7c10641c7c96
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2775,7 +2775,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
c5937c94b7cebd3d0f1d7d551a7b593c128e7673..37cb8ed15671205567ee20286e21fb904b3e3c9f
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2676,6 +2676,12 @@ Some examples of build configurations designed for 
developers of GCC are:
 @item @samp{bootstrap-asan}
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
+
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+targets with a very recent linux kernel (5.4 or later).
+
 @end table
 
 @section Building a cross compiler
diff --git a/libiberty/configure b/libiberty/configure
index 
7a34dabec32b0b383bd33f07811757335f4dd39c..cb2dd4ff5295598343cc18b3a79a86a778f2261d
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5261,6 +5261,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
f1ce76010c9acde79c5dc46686a78b2e2f19244e..043237628b79cbf37d07359b59c5ffe17a7a22ef
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
28dc21014b2e86988fa88adabd63ce6092e18e02..34aa397d785e3cc9b6975de460d065900364c3ff
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@
 AM_LDFLAGS = @ac_lto_plugin_

Document --with-build-config=bootstrap-asan option.

Document how to configure using asan (bootstrap-asan option).

Since I'm adding a bootstrap-hwasan option and documenting that, 
bootstrap-asan should also be documented.

(As Martin pointed out in the hwasan reviews).



(FYI I now notice that my hwasan patch 3/X needs this to apply cleanly).



gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* doc/install.texi: Document bootstrap-asan configuration option.





###


commit 6e0bbe33120ad3f92e4266ecfe4ecb8ce8958865
Author: Matthew Malcomson 
Date:   Wed Nov 6 12:48:08 2019 +

 Document bootstrap-asan compilation option

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 80b4781..c5937c9 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2670,6 +2670,14 @@ the build tree.

  @end table

+Some examples of build configurations designed for developers of GCC are:
+
+@table @asis
+@item @samp{bootstrap-asan}
+Compiles GCC itself using Address Sanitization in order to catch 
invalid memory
+accesses within the GCC code.
+@end table
+
  @section Building a cross compiler

  When building a cross compiler, it is not generally possible to do a

[PATCH 7/X] [libsanitizer] Add tests

Adding hwasan tests.

Only interesting thing here is that we have to make sure the tagging mechanism
is deterministic to avoid flaky tests.

gcc/testsuite/ChangeLog:

2019-12-12  Matthew Malcomson  

* c-c++-common/hwasan/aligned-alloc.c: New test.
* c-c++-common/hwasan/alloca-array-accessible.c: New test.
* c-c++-common/hwasan/alloca-gets-different-tag.c: New test.
* c-c++-common/hwasan/alloca-outside-caught.c: New test.
* c-c++-common/hwasan/arguments.c: New test.
* c-c++-common/hwasan/arguments-1.c: New test.
* c-c++-common/hwasan/arguments-2.c: New test.
* c-c++-common/hwasan/arguments-3.c: New test.
* c-c++-common/hwasan/asan-pr63316.c: New test.
* c-c++-common/hwasan/asan-pr70541.c: New test.
* c-c++-common/hwasan/asan-pr78106.c: New test.
* c-c++-common/hwasan/asan-pr79944.c: New test.
* c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test.
* c-c++-common/hwasan/bitfield-1.c: New test.
* c-c++-common/hwasan/bitfield-2.c: New test.
* c-c++-common/hwasan/builtin-special-handling.c: New test.
* c-c++-common/hwasan/check-interface.c: New test.
* c-c++-common/hwasan/halt_on_error-1.c: New test.
* c-c++-common/hwasan/heap-overflow.c: New test.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: New test.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: New test.
* c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test.
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test.
* c-c++-common/hwasan/hwasan-thread-success.c: New test.
* c-c++-common/hwasan/kernel-defaults.c: New test.
* c-c++-common/hwasan/large-aligned-0.c: New test.
* c-c++-common/hwasan/large-aligned-1.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-0.c: New test.
* c-c++-common/hwasan/large-aligned-untagging-1.c: New test.
* c-c++-common/hwasan/macro-definition.c: New test.
* c-c++-common/hwasan/no-sanitize-attribute.c: New test.
* c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test.
* c-c++-common/hwasan/param-instrument-reads.c: New test.
* c-c++-common/hwasan/param-instrument-writes.c: New test.
* c-c++-common/hwasan/param-memintrin.c: New test.
* c-c++-common/hwasan/random-frame-tag.c: New test.
* c-c++-common/hwasan/sanity-check-pure-c.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-0.c: New test.
* c-c++-common/hwasan/setjmp-longjmp-1.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-0.c: New test.
* c-c++-common/hwasan/stack-tagging-basic-1.c: New test.
* c-c++-common/hwasan/stack-tagging-disable.c: New test.
* c-c++-common/hwasan/unprotected-allocas-0.c: New test.
* c-c++-common/hwasan/unprotected-allocas-1.c: New test.
* c-c++-common/hwasan/use-after-free.c: New test.
* c-c++-common/hwasan/vararray-outside-caught.c: New test.
* c-c++-common/hwasan/vararray-stack-restore-correct.c: New test.
* c-c++-common/hwasan/very-large-objects.c: New test.
* g++.dg/hwasan/hwasan.exp: New file.
* g++.dg/hwasan/rvo-handled.C: New test.
* g++.dg/hwasan/try-catch-0.cpp: New test.
* g++.dg/hwasan/try-catch-1.cpp: New test.
* gcc.dg/hwasan/hwasan.exp: New file.
* gcc.dg/hwasan/nested-functions-0.c: New test.
* gcc.dg/hwasan/nested-functions-1.c: New test.
* gcc.dg/hwasan/nested-functions-2.c: New test.
* lib/hwasan-dg.exp: New file.



### Attachment also inlined for ease of reply###


diff --git a/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c 
b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
new file mode 100644
index 
..e5837a9f0766fc4d91fabc1a3c2e0017f845dc46
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/hwasan/aligned-alloc.c
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-shouldfail "hwasan" } */
+/* This program fails at runtime in the libhwasan library.
+   The allocator can't handle the requested invalid alignment.  */
+
+int
+main ()
+{
+  void *p = __builtin_aligned_alloc (17, 100);
+  if (((unsigned long long)p & 0x10) == 0)
+return 0;
+  return 1;
+}
+
+/* { dg-output "HWAddressSanitizer: invalid alignment requested in 
aligned_alloc: 17" } */
diff --git a/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c 
b/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c
new file mode 100644
index 
..c6b4d264b8c477b52b859a0411aabb0fc8dde352
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/hwasan/alloca-array-accessible.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+#define alloca __builtin_alloca
+
+int __attribute__ ((noinline))
+using_alloca (int num)
+{
+  int retval = 0;
+  int *big_a

[PATCH 6/X] [libsanitizer] Add hwasan pass and associated gimple changes

There are four main features to this change:

1) Check pointer tags match address tags.

In the new `hwasan` pass we put HWASAN_CHECK internal functions around
all memory accesses, to check that tags in the pointer being used match
the tag stored in shadow memory for the memory region being used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* asan.c (handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(hwasan_instrument): New.
(hwasan_base): New.
(hwasan_emit_untag_frame): Free block-scope-var hash map.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
(class pass_hwasan): New.
(make_pass_hwasan): New.
(class pass_hwasan_O0): New.
(make_pass_hwasan_O0): New.
* asan.h (hwasan_base): New decl.
(hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(enum hwasan_mark_flags): New.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
* internal-fn.def (HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_ALLOCA_UNPOISON): New.
* passes.def: Add hwasan and hwasan_O0 passes.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.
(BUILT_IN_HWASAN_STORE8): New.
(BUILT_IN_HWASAN_STORE16): New.
(BUILT_IN_HWASAN_STOREN): New.
(BUILT_IN_HWASAN_LOAD1_NOABORT): New.
(BUILT_IN_HWASAN_LOAD2_NOABORT): New.
(BUILT_IN_HWASAN_LOAD4_NOABORT): New.
(BUILT_IN_HWASAN_LOA

[PATCH 5/X] [libsanitizer][mid-end] Introduce stack variable handling for HWASAN

Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by ensuring that each tagged variable is aligned to the tag
granule representation size and also ensure that the end of each
variable as an alignment boundary between the end and the start of any
other data stored on the stack.

This patch ensures that by adding alignment requirements in
`align_local_variable` and forcing all stack variable allocation to be
deferred so that `expand_stack_vars` can ensure the stack pointer is
aligned before allocating any variable for the current frame.

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
 random tag.
  3) References to stack variables are now formed with RTL describing an
 offset from this base in both tag and value.

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.

The tag offsets are also handled by a backend hook.

This patch also adds some macros defining how the HWASAN shadow memory
is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable the tag to match the tag added to each pointer for
that variable.

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tag.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* asan.c (hwasan_record_base): New function.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New function.
(hwasan_with_tag): New function.
(hwasan_tag_init): New function.
(initialize_sanitizer_builtins): Define new builtins.
(ATTR_NOTHROW_LIST): New macro.
(hwasan_current_tag): New.
(hwasan_emit_prologue): New.
(hwasan_create_untagged_base): New.
(hwasan_finish_file): New.
(hwasan_sanitize_stack_p): New.
(hwasan_sanitize_p): New.
* asan.h (hwasan_record_base): New declaration.
(hwasan_emit_untag_frame): New.
(hwasan_increment_tag): New declaration.
(hwasan_with_tag): New declaration.
(hwasan_sanitize_stack_p): New declaration.
(hwasan_tag_init): New declaration.
(hwasan_sanitize_p): New declaration.
(HWASAN_TAG_SIZE): New macro.
(HWASAN_TAG_GRANULE_SIZE):New macro.
(HWASAN_SHIFT):New macro.
(HWASAN_SHIFT_RTX):New macro.
(HWASAN_STACK_BACKGROUND):New macro.
(hwasan_finish_file): New.
(hwasan_current_tag): New.
(hwasan_create_untagged_base): New.
(hwasan_emit_prologue): New.
* cfgexpand.c (struct stack_vars_data): Add information to
record hwasan variable stack offsets.
(expand_stack_vars): Ensure variables are offset from a tagged
base. Record offsets for hwasan. Ensure alignment.
(expand_used_vars): Call function to emit prologue, and get
untagging instructions for function exit.
(align_local_variable): Ensure alignment.
(defer_stack_allocation): Ensure all variables are deferred so
they can be handled by `expand_stack_vars`.
(expand_one_stack_var_at): Account for tags in
variables when using HWASAN.
(expand_one_stack_var_1): Pass new argument to
expand_one_stack_var_at.
(init_vars_expansion): Initialise hwasan internal variables when
starting variable expansion.
* doc/tm.texi (TARGET_MEMTAG_GENTAG): Document.
* doc/tm.texi.in (TARGET_MEMTAG_GENTAG): Document

[PATCH 1/X] [libsanitizer] Tie the hwasan library into our build system

This patch tries to tie libhwasan into the GCC build system in the same way
that the other sanitizer runtime libraries are handled.

libsanitizer/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am:  Build libhwasan.
* Makefile.in:  Build libhwasan.
* asan/Makefile.in:  Build libhwasan.
* configure:  Build libhwasan.
* configure.ac:  Build libhwasan.
* hwasan/Makefile.am: New file.
* hwasan/Makefile.in: New file.
* hwasan/libtool-version: New file.
* interception/Makefile.in: Build libhwasan.
* libbacktrace/Makefile.in: Build libhwasan.
* libsanitizer.spec.in: Build libhwasan.
* lsan/Makefile.in: Build libhwasan.
* merge.sh: Build libhwasan.
* sanitizer_common/Makefile.in: Build libhwasan.
* tsan/Makefile.in: Build libhwasan.
* ubsan/Makefile.in: Build libhwasan.



### Attachment also inlined for ease of reply###


diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am
index 
65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b
 100644
--- a/libsanitizer/Makefile.am
+++ b/libsanitizer/Makefile.am
@@ -14,11 +14,12 @@ endif
 if LIBBACKTRACE_SUPPORTED
 SUBDIRS += libbacktrace
 endif
-SUBDIRS += lsan asan ubsan
+SUBDIRS += lsan asan ubsan hwasan
 nodist_saninclude_HEADERS += \
   include/sanitizer/lsan_interface.h \
   include/sanitizer/asan_interface.h \
-  include/sanitizer/tsan_interface.h
+  include/sanitizer/tsan_interface.h \
+  include/sanitizer/hwasan_interface.h
 if TSAN_SUPPORTED
 SUBDIRS += tsan
 endif
diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 
0d789b3a59d21ea2e5a23057ca3afe15425feec4..36aa952af7e04bc0e4fb94cdcd584d539193d781
 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -92,7 +92,8 @@ target_triplet = @target@
 @SANITIZER_SUPPORTED_TRUE@am__append_1 = 
include/sanitizer/common_interface_defs.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/lsan_interface.h \
 @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/asan_interface.h \
-@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h \
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/hwasan_interface.h
 @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = 
interception
 @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = 
libbacktrace
 @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan
@@ -206,7 +207,7 @@ ETAGS = etags
 CTAGS = ctags
 CSCOPE = cscope
 DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \
-   ubsan tsan
+   ubsan hwasan tsan
 ACLOCAL = @ACLOCAL@
 ALLOC_FILE = @ALLOC_FILE@
 AMTAR = @AMTAR@
@@ -328,6 +329,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
@@ -361,7 +363,7 @@ sanincludedir = 
$(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer
 nodist_saninclude_HEADERS = $(am__append_1)
 @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \
 @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \
-@SANITIZER_SUPPORTED_TRUE@ $(am__append_4)
+@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4)
 gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 
 # Work around what appears to be a GNU make bug handling MAKEFLAGS
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 
00b6082da5372efd679ddc230f588bbc58161ef6..76689c3b224b1fb04895ae48829eac4b6784cd84
 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -382,6 +382,7 @@ install_sh = @install_sh@
 libdir = @libdir@
 libexecdir = @libexecdir@
 link_libasan = @link_libasan@
+link_libhwasan = @link_libhwasan@
 link_liblsan = @link_liblsan@
 link_libtsan = @link_libtsan@
 link_libubsan = @link_libubsan@
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 
79b5c1eadb59018bca13a33f19f3494c170365ee..ff72af73e6f77aaf93bf39e6799f896851a377dd
 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -657,6 +657,7 @@ USING_MAC_INTERPOSE_TRUE
 link_liblsan
 link_libubsan
 link_libtsan
+link_libhwasan
 link_libasan
 LSAN_SUPPORTED_FALSE
 LSAN_SUPPORTED_TRUE
@@ -12334,7 +12335,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12337 "configure"
+#line 12338 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12440,7 +12441,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12443 "configure"
+#line 12444 "configure"
 #

[Patch 0/X] HWASAN v3

Hello,

I've gone through the suggestions Martin made and implemented  the ones I think
I can implement for GCC10.

The two functionality changes in this version are:
Added the --param's hwasan-instrument-reads, hwasan-instrument-writes,
hwasan-instrument-allocas, hwasan-memintrin, options.  I.e. Those that asan has
and that make sense for hwasan.

Avoided HWASAN_STACK_BACKGROUND in hwasan_increment_tag when using a
deterministic tagging approach.


There are a lot of extra comments and tests.


Bootstrapped and regtested on x86_64 and AArch64.
Bootstrapped with `--with-build-config=bootstrap-hwasan` on AArch64 and hwasan
features tested there.
Built the linux kernel using this feature and ran the test_kasan.ko testing to
check the this works for the kernel.
(NOTE: I actually did all the above testing before a search and replace of
`memory_tagging_p` for `hwasan_sanitize_p` and fixing a typo in the
`hwasan-instrument-allocas` parameter name, I will run all the tests again
before committing but figure I'll send this out now since I fully expect the
tests to still pass).


I noticed one extra testsuite failure from those mentioned in the previous
version emails: g++.dg/cpp2a/ucn2.C.
I believe this is HWASAN correctly catching a problem in the compiler.
I've logged the issue here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92919 .


I haven't gotten ASAN_MARK to print as HWASAN_MARK when using memory tagging,
since I'm not sure the way I found to implement this would be acceptable.  The
inlined patch below works but it requires a special declaration instead of just
an ~#include~.


diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index a1bc081..d81eb12 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -101,10 +101,16 @@ extern void init_internal_fns ();
 
 extern const char *const internal_fn_name_array[];
 
+
+extern bool hwasan_sanitize_p (void);
 static inline const char *
 internal_fn_name (enum internal_fn fn)
 {
-  return internal_fn_name_array[(int) fn];
+  const char *ret = internal_fn_name_array[(int) fn];
+  if (! strcmp (ret, "ASAN_MARK")
+  && hwasan_sanitize_p ())
+return "HWASAN_MARK";
+  return ret;
 }
 
 extern internal_fn lookup_internal_fn (const char *);


Entire patch series attached to cover letter.

all-patches.tar.gz
Description: all-patches.tar.gz

[PATCH 3/X] [libsanitizer] Add option to bootstrap using HWASAN

This is an analogous option to --bootstrap-asan to configure.  It allows
bootstrapping GCC using HWASAN.

For the same reasons as for ASAN we have to avoid using the HWASAN
sanitizer when compiling libiberty and the lto-plugin.

Also add a function to query whether -fsanitize=hwaddress has been
passed.

ChangeLog:

2019-08-29  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Add --bootstrap-hwasan option.

config/ChangeLog:

2019-12-12  Matthew Malcomson  

* bootstrap-hwasan.mk: New file.

libiberty/ChangeLog:

2019-12-12  Matthew Malcomson  

* configure: Regenerate.
* configure.ac: Avoid using sanitizer.

lto-plugin/ChangeLog:

2019-12-12  Matthew Malcomson  

* Makefile.am: Avoid using sanitizer.
* Makefile.in: Regenerate.



### Attachment also inlined for ease of reply###


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
new file mode 100644
index 
..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25
--- /dev/null
+++ b/config/bootstrap-hwasan.mk
@@ -0,0 +1,8 @@
+# This option enables -fsanitize=hwaddress for stage2 and stage3.
+
+STAGE2_CFLAGS += -fsanitize=hwaddress
+STAGE3_CFLAGS += -fsanitize=hwaddress
+POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
+ -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs
diff --git a/configure b/configure
index 
aec9186b2b0123d3088b69eb1ee541567654953e..6f71b111bd18ec053180beecf83dd4549e83c2b9
 100755
--- a/configure
+++ b/configure
@@ -7270,7 +7270,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/configure.ac b/configure.ac
index 
b8ce2ad20b9d03e42731252a9ec2a8417c13e566..16bfdf164555dad94c789f17b6a63ba1a2e3e9f4
 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2775,7 +2775,7 @@ fi
 # or bootstrap-ubsan, bootstrap it.
 if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then
   case "$BUILD_CONFIG" in
-*bootstrap-asan* | *bootstrap-ubsan* )
+*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* )
   bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer,
   bootstrap_fixincludes=yes
   ;;
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 
6c9579bfaff955eb43875b404fb7db1a667bf522..da9a8809c3440827ac22ef6936e080820197f4e7
 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -2645,6 +2645,13 @@ Some examples of build configurations designed for 
developers of GCC are:
 Compiles GCC itself using Address Sanitization in order to catch invalid memory
 accesses within the GCC code.
 
+@item @samp{bootstrap-hwasan}
+Compiles GCC itself using HWAddress Sanitization in order to catch invalid
+memory accesses within the GCC code.  This option is only available on AArch64
+targets with a very recent linux kernel (5.4 or later).
+
+@end table
+
 @section Building a cross compiler
 
 When building a cross compiler, it is not generally possible to do a
diff --git a/libiberty/configure b/libiberty/configure
index 
7a34dabec32b0b383bd33f07811757335f4dd39c..cb2dd4ff5295598343cc18b3a79a86a778f2261d
 100755
--- a/libiberty/configure
+++ b/libiberty/configure
@@ -5261,6 +5261,7 @@ fi
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 
 
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 
f1ce76010c9acde79c5dc46686a78b2e2f19244e..043237628b79cbf37d07359b59c5ffe17a7a22ef
 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -240,6 +240,7 @@ AC_SUBST(PICFLAG)
 NOASANFLAG=
 case " ${CFLAGS} " in
   *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;;
+  *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;;
 esac
 AC_SUBST(NOASANFLAG)
 
diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
index 
28dc21014b2e86988fa88adabd63ce6092e18e02..34aa397d785e3cc9b6975de460d065900364c3ff
 100644
--- a/lto-plugin/Makefile.am
+++ b/lto-plugin/Makefile.am
@@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS)
 AM_CFLAGS = @ac_lto_plugin_warn_cflags@
 AM_LDFLAGS = @ac_lto_plugin_ldflags@
 AM_LIBTOOLFLAGS = --tag=disable-static
-override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS))
-override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS))
+override CFLAGS :

[PATCH 4/X] [libsanitizer][options] Add hwasan flags and argument parsing

These flags can't be used at the same time as any of the other
sanitizers.
We add an equivalent flag to -static-libasan in -static-libhwasan to
ensure static linking.

The -fsanitize=kernel-hwaddress option is for compiling targeting the
kernel.  This flag has defaults that allow compiling KASAN with tags as
it is currently implemented.
These defaults are that we do not sanitize variables on the stack and
always recover from a detected bug.
Stack tagging in the kernel is a future aim, stack instrumentation has
not yet been enabled for the kernel for clang either
(https://lists.infradead.org/pipermail/linux-arm-kernel/2019-October/687121.html).

We introduce a backend hook `targetm.memtag.can_tag_addresses` that
indicates to the mid-end whether a target has a feature like AArch64 TBI
where the top byte of an address is ignored.
Without this feature hwasan sanitization is not done.

gcc/ChangeLog:

2019-12-12  Matthew Malcomson  

* common.opt (flag_sanitize_recover): Default for kernel
hwaddress.
(static-libhwasan): New cli option.
* config/aarch64/aarch64.c (aarch64_can_tag_addresses): New.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
* config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of
asan command line flags.
* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Add hwasan equivalent of __SANITIZE_ADDRESS__.
* doc/tm.texi: Document new hook.
* doc/tm.texi.in: Document new hook.
* flag-types.h (enum sanitize_code): New sanitizer values.
* gcc.c (STATIC_LIBHWASAN_LIBS): New macro.
(LIBHWASAN_SPEC): New macro.
(LIBHWASAN_EARLY_SPEC): New macro.
(SANITIZER_EARLY_SPEC): Update to include hwasan.
(SANITIZER_SPEC): Update to include hwasan.
(sanitize_spec_function): Use hwasan options.
* opts.c (finish_options): Describe conflicts between address
sanitizers.
(sanitizer_opts): Introduce new sanitizer flags.
(common_handle_option): Add defaults for kernel sanitizer.
* params.opt (hwasan-stack): New
(hwasan-random-frame-tag): New
(hwasan-instrument-allocas): New
(hwasan-instrument-reads): New
(hwasan-instrument-writes): New
(hwasan-memintrin): New
* target.def (HOOK_PREFIX): Add new hook.
* targhooks.c (default_memtag_can_tag_addresses): New.
* toplev.c (process_options): Ensure hwasan only on TBI
architectures.

gcc/c-family/ChangeLog:

2019-12-12  Matthew Malcomson  

* c-attribs.c (handle_no_sanitize_hwaddress_attribute): New
attribute.



### Attachment also inlined for ease of reply###


diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 
dc56e2ec62ffc7f494cc59ddf02452ac0cb406de..81f8dfc6dd3f54067d7cc0def3c0babc03bcd9c4
 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
bool *);
 static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
  int, bool *);
+static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
+   int, bool *);
 static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
 int, bool *);
 static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
@@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_no_sanitize_attribute, NULL },
   { "no_sanitize_address",0, 0, true, false, false, false,
  handle_no_sanitize_address_attribute, NULL },
+  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
+ handle_no_sanitize_hwaddress_attribute, NULL },
   { "no_sanitize_thread", 0, 0, true, false, false, false,
  handle_no_sanitize_thread_attribute, NULL },
   { "no_sanitize_undefined",  0, 0, true, false, false, false,
@@ -946,6 +950,22 @@ handle_no_sanitize_address_attribute (tree *node, tree 
name, tree, int,
   return NULL_TREE;
 }
 
+/* Handle a "no_sanitize_hwaddress" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_no_sanitize_hwaddress_attribute (tree *node, tree name, tree, int,
+ bool *no_add_attrs)
+{
+  *no_add_attrs = true;
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+warning (OPT_Wattributes, "%qE attribute ignored", name);
+  else
+add_no_sanitize_value (*node, SANITIZE_HWADDRESS);
+
+  return NULL_TREE;
+}
+
 /* Handle a "no_sanitize_thread" attribute; arguments as in
struct attr

[PATCH 2/X] [libsanitizer] Only build libhwasan when targeting AArch64