Hi Richard,

I'm sending the revised patch 5 (introducing stack variable handling) on
its own, without the changes to the other patches.

This patch has changed quite a lot, and I wanted to give you time to
review it while I finish the less widespread changes in patch 6.  I also
wanted your feedback before running the more exhaustive (and
time-consuming) tests, since those would just have to be repeated if you
didn't like the changes.

The big differences between this and the last version are:

- I moved all helper functions which rely on how the tag is encoded into
  hooks.  This allows backends to choose a different ABI for hwasan
  tagging.  That said, any ABI which doesn't ensure the entire tag is
  stored in the top byte is unsupported, as the library doesn't handle
  anything else.

- I no longer delay emitting RTL to initialise the hwasan_base_pointer.
  It's now emitted as soon as the pointer is used anywhere.

- I no longer delay allocating stack variables for hwasan to work.
  Instead we record stack variables when allocating through
  expand_one_stack_var_1 *and* when allocating through expand_stack_vars.

- Use `frame_offset` in more places to avoid having to manually handle
  the case of `!FRAME_GROWS_DOWNWARD`.



------------------------------------------------------------------------

Handling stack variables has three parts.

1) Ensure HWASAN required alignment for stack variables

When tagging shadow memory, we need to ensure that each tag granule is
only used by one variable at a time.

This is done by aligning the start of each tagged variable to the tag
granule size, and by also aligning the end of each object so that any
other data stored on the stack starts in a different granule.

This patch ensures the above by adding alignment requirements in
`align_local_variable` and forcing the stack pointer to be aligned before
allocating any stack objects.
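
As a rough illustration of this requirement (not the code in this patch),
the sketch below rounds both the start and the end of an object up to the
granule size.  The 16-byte granule and the `example_*` names are assumptions
made for the example; in the patch the value comes from the
`TARGET_MEMTAG_GRANULE_SIZE` hook.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed granule size for illustration only.  */
#define EXAMPLE_GRANULE_SIZE 16u

/* Round VALUE up to the next multiple of ALIGN, which must be a power of
   two.  */
static uint64_t
example_align_up (uint64_t value, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Place an object of SIZE bytes at or after OFFSET so that it starts on a
   granule boundary and the next free offset is granule aligned as well.
   Return the start offset and store the next free offset in *NEXT.  */
static uint64_t
example_place_object (uint64_t offset, uint64_t size, uint64_t *next)
{
  uint64_t start = example_align_up (offset, EXAMPLE_GRANULE_SIZE);
  *next = example_align_up (start + size, EXAMPLE_GRANULE_SIZE);
  return start;
}
```

With these assumptions a 20-byte object requested at offset 5 would occupy
the granules starting at offset 16, and the next object could not start
before offset 48, so the two never share a granule.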

2) Put tags into each stack variable pointer

Make sure that every pointer to a stack variable includes a tag of some
sort on it.

The way tagging works is:
  1) For every new stack frame, a random tag is generated.
  2) A base register is formed from the stack pointer value and this
     random tag.
  3) References to stack variables are now formed with RTL describing an
     offset from this base in both tag and value.
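
The steps above can be sketched on plain 64-bit integers.  This is only a
model under the assumption that the tag occupies the whole top byte of a
64-bit pointer; the `example_*` helpers are hypothetical stand-ins for what
the backend hooks would express in RTL, not the hooks themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: an 8-bit tag in bits 56..63.  */
#define EXAMPLE_TAG_SHIFT 56
#define EXAMPLE_ADDR_MASK ((UINT64_C (1) << EXAMPLE_TAG_SHIFT) - 1)

/* Return PTR with its top byte replaced by TAG.  */
static uint64_t
example_set_tag (uint64_t ptr, uint8_t tag)
{
  return (ptr & EXAMPLE_ADDR_MASK) | ((uint64_t) tag << EXAMPLE_TAG_SHIFT);
}

/* Return the tag stored in the top byte of PTR.  */
static uint8_t
example_extract_tag (uint64_t ptr)
{
  return (uint8_t) (ptr >> EXAMPLE_TAG_SHIFT);
}

/* Form a variable's pointer from the tagged frame base BASE: ADDR_OFFSET
   is added to the address bits and TAG_OFFSET to the tag, with the tag
   wrapping modulo 256.  */
static uint64_t
example_add_tag (uint64_t base, int64_t addr_offset, uint8_t tag_offset)
{
  uint64_t addr = ((base & EXAMPLE_ADDR_MASK) + (uint64_t) addr_offset)
		  & EXAMPLE_ADDR_MASK;
  uint8_t tag = (uint8_t) (example_extract_tag (base) + tag_offset);
  return example_set_tag (addr, tag);
}
```

Keeping the address offset and the tag offset separate means an offset in
one can never carry into the other.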

The random tag generation is handled by a backend hook.  This hook
decides whether to introduce a random tag or use the stack background
based on the parameter hwasan-random-frame-tag.  Using the stack
background is necessary for testing and bootstrap.  It is necessary
during bootstrap to avoid breaking the `configure` test program for
determining stack direction.

Using the stack background means that every stack frame has the initial
tag of zero and variables are tagged with incrementing tags from 1,
which also makes debugging a bit easier.
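
A simplified model of how the per-variable tag offsets advance is sketched
below.  It follows the logic of `hwasan_record_frame_init` and
`hwasan_increment_frame_tag` in this patch, but assumes an 8-bit tag and
leaves out the kernel special case; the `example_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tag width for illustration.  */
#define EXAMPLE_TAG_BITS 8

/* Tag offset given to the first variable of a new frame.  Without random
   frame tags we start at 1 so that no variable gets the background tag 0.  */
static uint8_t
example_initial_tag_offset (bool random_frame_tag)
{
  return random_frame_tag ? 0 : 1;
}

/* Tag offset for the next variable.  The offset wraps modulo the number of
   representable tags and, without random frame tags, skips 0.  */
static uint8_t
example_next_tag_offset (uint8_t offset, bool random_frame_tag)
{
  offset = (uint8_t) ((offset + 1u) % (1u << EXAMPLE_TAG_BITS));
  if (offset == 0 && !random_frame_tag)
    offset += 1;
  return offset;
}
```

With `random_frame_tag` false, the offsets run 1, 2, ..., 255 and then wrap
back to 1, never reusing the background tag.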

The tag&value offsets are also handled by a backend hook.

This patch also adds some macros defining how the HWASAN shadow memory
is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function
   prologue and epilogue.

On entry to each function we tag the relevant shadow stack region for
each stack variable with the same tag that was added to the pointers for
that variable.
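
To show what tagging a shadow region amounts to, here is a toy model with
one shadow byte per granule.  The array-backed stack, the 16-byte granule,
and the `example_*` functions are all assumptions for illustration;
`example_tag_memory` only stands in for the library call and is not
libhwasan's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_GRANULE_SIZE 16u
#define EXAMPLE_STACK_BYTES 256u

/* One shadow byte per granule of the toy stack, initially the background
   tag of zero.  */
static uint8_t example_shadow[EXAMPLE_STACK_BYTES / EXAMPLE_GRANULE_SIZE];

/* Set the shadow tag for SIZE bytes starting at OFFSET, an untagged offset
   from the start of the toy stack.  Both must be granule aligned.  */
static void
example_tag_memory (size_t offset, uint8_t tag, size_t size)
{
  assert (offset % EXAMPLE_GRANULE_SIZE == 0);
  assert (size % EXAMPLE_GRANULE_SIZE == 0);
  assert (offset + size <= EXAMPLE_STACK_BYTES);
  memset (example_shadow + offset / EXAMPLE_GRANULE_SIZE, tag,
	  size / EXAMPLE_GRANULE_SIZE);
}

/* Return nonzero if an access at OFFSET through a pointer carrying PTR_TAG
   matches the shadow tag of the granule it touches.  */
static int
example_access_ok (size_t offset, uint8_t ptr_tag)
{
  assert (offset < EXAMPLE_STACK_BYTES);
  return example_shadow[offset / EXAMPLE_GRANULE_SIZE] == ptr_tag;
}
```

After tagging a 48-byte variable at offset 32 with tag 0x2a, accesses
through a 0x2a pointer match anywhere in offsets 32..79, while the first
byte past the variable still carries the background tag and does not match.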

This is the first patch where we use the HWASAN shadow space, so we need
to add in the libhwasan initialisation code that creates this shadow
memory region into the binary we produce.  This instrumentation is done
in `compile_file`.

When exiting a function we need to ensure the shadow stack for this
function has no remaining tag.  Without clearing the shadow stack area
for this stack frame, later function calls could get false positives
when those later function calls check untagged areas (such as parameters
passed on the stack) against a shadow stack area with left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.
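
The region cleared on exit can be pictured with the sketch below, loosely
modelled on how `hwasan_emit_untag_frame` in this patch orders its two
pointers.  It works on plain integers rather than RTL, and the
`example_range` type and function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Region of the stack whose shadow should be reset to the background tag.  */
struct example_range
{
  uint64_t bottom;
  uint64_t size;
};

/* VARS is the frame base and DYNAMIC the farthest extent of the frame.
   Which of the two is the lower address depends on the direction in which
   the frame grows.  */
static struct example_range
example_frame_range (uint64_t dynamic, uint64_t vars,
		     bool frame_grows_downward)
{
  uint64_t top = frame_grows_downward ? vars : dynamic;
  uint64_t bot = frame_grows_downward ? dynamic : vars;
  assert (top >= bot);
  struct example_range range = { bot, top - bot };
  return range;
}
```

The resulting `bottom` and `size` are what would be handed to the tagging
routine together with the background tag of zero.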


ChangeLog:

        * config/bootstrap-hwasan.mk: Disable random frame tags for
        stack-tagging during bootstrap.
        * gcc/asan.c (struct hwasan_stack_var): New.
        (hwasan_sanitize_p): New.
        (hwasan_sanitize_stack_p): New.
        (hwasan_sanitize_allocas_p): New.
        (initialize_sanitizer_builtins): Define new builtins.
        (ATTR_NOTHROW_LIST): New macro.
        (hwasan_current_frame_tag): New.
        (hwasan_frame_base): New.
        (hwasan_record_stack_var): New.
        (hwasan_get_frame_extent): New.
        (hwasan_increment_frame_tag): New.
        (hwasan_record_frame_init): New.
        (hwasan_emit_prologue): New.
        (hwasan_emit_untag_frame): New.
        (hwasan_finish_file): New.
        (hwasan_truncate_to_tag_size): New.
        * gcc/asan.h (hwasan_record_frame_init): New declaration.
        (hwasan_record_stack_var): New declaration.
        (hwasan_emit_prologue): New declaration.
        (hwasan_emit_untag_frame): New declaration.
        (hwasan_get_frame_extent): New declaration.
        (hwasan_frame_base): New declaration.
        (hwasan_current_frame_tag): New declaration.
        (hwasan_increment_frame_tag): New declaration.
        (hwasan_truncate_to_tag_size): New declaration.
        (hwasan_finish_file): New declaration.
        (hwasan_sanitize_p): New declaration.
        (hwasan_sanitize_stack_p): New declaration.
        (hwasan_sanitize_allocas_p): New declaration.
        (HWASAN_TAG_SIZE): New macro.
        (HWASAN_TAG_GRANULE_SIZE): New macro.
        (HWASAN_STACK_BACKGROUND): New macro.
        * gcc/builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New.
        * gcc/builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN.
        * gcc/cfgexpand.c (align_local_variable): When using hwasan ensure
        alignment to tag granule.
        (align_frame_offset): New.
        (expand_one_stack_var_at): For hwasan use tag offset.
        (expand_stack_vars): Record stack objects for hwasan.
        (expand_one_stack_var_1): Record stack objects for hwasan.
        (init_vars_expansion): Initialise hwasan state.
        (expand_used_vars): Emit hwasan prologue and generate hwasan epilogue.
        * gcc/doc/tm.texi (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
        TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
        TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
        TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
        * gcc/doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE,
        TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG,
        TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG,
        TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks.
        * gcc/explow.c (get_dynamic_stack_base): Take new `base` argument.
        * gcc/explow.h (get_dynamic_stack_base): Take new `base` argument.
        * gcc/sanitizer.def (BUILT_IN_HWASAN_INIT): New.
        (BUILT_IN_HWASAN_TAG_MEM): New.
        * gcc/target.def (target_memtag_tag_size,target_memtag_granule_size,
        target_memtag_insert_random_tag,target_memtag_add_tag,
        target_memtag_set_tag,target_memtag_extract_tag,
        target_memtag_untagged_pointer): New hooks.
        * gcc/targhooks.c (HWASAN_SHIFT): New.
        (HWASAN_SHIFT_RTX): New.
        (default_memtag_tag_size): New default hook.
        (default_memtag_granule_size): New default hook.
        (default_memtag_insert_random_tag): New default hook.
        (default_memtag_add_tag): New default hook.
        (default_memtag_set_tag): New default hook.
        (default_memtag_extract_tag): New default hook.
        (default_memtag_untagged_pointer): New default hook.
        * gcc/targhooks.h (default_memtag_tag_size): New default hook.
        (default_memtag_granule_size): New default hook.
        (default_memtag_insert_random_tag): New default hook.
        (default_memtag_add_tag): New default hook.
        (default_memtag_set_tag): New default hook.
        (default_memtag_extract_tag): New default hook.
        (default_memtag_untagged_pointer): New default hook.
        * gcc/toplev.c (compile_file): Call hwasan_finish_file when finished.


###############     Attachment also inlined for ease of reply    ###############


diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk
index 4f60bed3fd6e98b47a3a38aea6eba2a7c320da25..91989f4bb1db6ccff564383777757b896645e541 100644
--- a/config/bootstrap-hwasan.mk
+++ b/config/bootstrap-hwasan.mk
@@ -1,7 +1,11 @@
 # This option enables -fsanitize=hwaddress for stage2 and stage3.
+# We need to disable random frame tags for bootstrap since the autoconf check
+# for which direction the stack is growing has UB that a random frame tag
+# breaks.  Running with a random frame tag gives approx. 50% chance of
+# bootstrap comparison diff in libiberty/alloca.c.
 
-STAGE2_CFLAGS += -fsanitize=hwaddress
-STAGE3_CFLAGS += -fsanitize=hwaddress
+STAGE2_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
+STAGE3_CFLAGS += -fsanitize=hwaddress --param hwasan-random-frame-tag=0
 POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \
                      -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \
                      -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \
diff --git a/gcc/asan.h b/gcc/asan.h
index 9efd33f9b86babbc10c4553c31b86950a313a242..918ee63000714247c2e1b0d810f56a2330dcf46f 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -34,6 +34,20 @@ extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
 extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *,
                                    hash_map<tree, tree> &);
 
+extern void hwasan_record_frame_init ();
+extern void hwasan_record_stack_var (rtx, rtx, poly_int64, poly_int64);
+extern void hwasan_emit_prologue ();
+extern rtx_insn *hwasan_emit_untag_frame (rtx, rtx);
+extern rtx hwasan_get_frame_extent ();
+extern rtx hwasan_frame_base ();
+extern uint8_t hwasan_current_frame_tag ();
+extern void hwasan_increment_frame_tag ();
+extern rtx hwasan_truncate_to_tag_size (rtx, rtx);
+extern void hwasan_finish_file (void);
+extern bool hwasan_sanitize_p (void);
+extern bool hwasan_sanitize_stack_p (void);
+extern bool hwasan_sanitize_allocas_p (void);
+
 extern gimple_stmt_iterator create_cond_insert_point
      (gimple_stmt_iterator *, bool, bool, bool, basic_block *, basic_block *);
 
@@ -75,6 +89,26 @@ extern hash_set <tree> *asan_used_labels;
 
 #define ASAN_USE_AFTER_SCOPE_ATTRIBUTE "use after scope memory"
 
+/* NOTE: The values below and the hooks under targetm.memtag define an ABI and
+   are hard-coded to these values in libhwasan, hence they can't be changed
+   independently here.  */
+/* How many bits are used to store a tag in a pointer.
+   The default version uses the entire top byte of a pointer (i.e. 8 bits).  */
+#define HWASAN_TAG_SIZE targetm.memtag.tag_size ()
+/* Tag Granule of HWASAN shadow stack.
+   This is the size in real memory that each byte in the shadow memory refers
+   to.  I.e. if a variable is X bytes long in memory then its tag in shadow
+   memory will span X / HWASAN_TAG_GRANULE_SIZE bytes.
+   Most variables will need to be aligned to this amount since two variables
+   that are neighbours in memory and share a tag granule would need to share
+   the same tag (the shared tag granule can only store one tag).  */
+#define HWASAN_TAG_GRANULE_SIZE targetm.memtag.granule_size ()
+/* Define the tag for the stack background.
+   This defines what tag the stack pointer will be and hence what tag all
+   variables that are not given special tags are (e.g. spilled registers,
+   and parameters passed on the stack).  */
+#define HWASAN_STACK_BACKGROUND gen_int_mode (0, QImode)
+
 /* Various flags for Asan builtins.  */
 enum asan_check_flags
 {
diff --git a/gcc/asan.c b/gcc/asan.c
index 9c9aa4cae35832c1534a2cffac1d3d13eed0e687..80dd4f0b6f92031b68c85bfea76c7012f7b5206c 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -257,6 +257,47 @@ hash_set<tree> *asan_handled_variables = NULL;
 
 hash_set <tree> *asan_used_labels = NULL;
 
+/* Global variables for HWASAN stack tagging.  */
+/* hwasan_frame_tag_offset records the offset from the frame base tag that the
+   next object should have.  */
+static uint8_t hwasan_frame_tag_offset = 0;
+/* hwasan_frame_base_ptr is a pointer with the same address as
+   `virtual_stack_vars_rtx` for the current frame, and with the frame base tag
+   stored in it.  N.b. this global RTX does not need to be marked GTY, but is
+   done so anyway.  The need is not there since all uses are in just one pass
+   (cfgexpand) and there are no calls to ggc_collect between the uses.  We mark
+   it GTY(()) anyway to allow the use of the variable later on if needed by
+   future features.  */
+static GTY(()) rtx hwasan_frame_base_ptr = NULL_RTX;
+
+/* Structure defining the extent of one object on the stack that HWASAN needs
+   to tag in the corresponding shadow stack space.
+
+   The range this object spans on the stack is between `untagged_base +
+   nearest_offset` and `untagged_base + farthest_offset`.
+   `tagged_base` is an rtx containing the same value as `untagged_base` but
+   with a random tag stored in the top byte.  We record both `untagged_base`
+   and `tagged_base` so that `hwasan_emit_prologue` can use both without having
+   to emit RTL into the instruction stream to re-calculate one from the other.
+   (`hwasan_emit_prologue` needs to use both bases since the
+   __hwasan_tag_memory call it emits uses an untagged value, and it calculates
+   the tag to store in shadow memory based on the tag_offset plus the tag in
+   tagged_base).  */
+struct hwasan_stack_var
+{
+  rtx untagged_base;
+  rtx tagged_base;
+  poly_int64 nearest_offset;
+  poly_int64 farthest_offset;
+  uint8_t tag_offset;
+};
+
+/* Variable recording all stack variables that HWASAN needs to tag.
+   Does not need to be marked as GTY(()) since every use is in the cfgexpand
+   pass and ggc_collect is not called in the middle of that pass.  */
+static vec<hwasan_stack_var *> hwasan_tagged_stack_vars;
+
+
 /* Sets shadow offset to value in string VAL.  */
 
 bool
@@ -1352,6 +1393,28 @@ asan_redzone_buffer::flush_if_full (void)
     flush_redzone_payload ();
 }
 
+/* Returns whether we are tagging pointers and checking those tags on memory
+   access.  */
+bool
+hwasan_sanitize_p ()
+{
+  return sanitize_flags_p (SANITIZE_HWADDRESS);
+}
+
+/* Are we tagging the stack?  */
+bool
+hwasan_sanitize_stack_p ()
+{
+  return (hwasan_sanitize_p () && param_hwasan_instrument_stack);
+}
+
+/* Are we protecting alloca objects?  */
+bool
+hwasan_sanitize_allocas_p (void)
+{
+  return (hwasan_sanitize_stack_p () && param_hwasan_protect_allocas);
+}
+
 /* Insert code to protect stack vars.  The prologue sequence should be emitted
    directly, epilogue sequence returned.  BASE is the register holding the
    stack base, against which OFFSETS array offsets are relative to, OFFSETS
@@ -2901,6 +2964,11 @@ initialize_sanitizer_builtins (void)
     = build_function_type_list (void_type_node, uint64_type_node,
                                ptr_type_node, NULL_TREE);
 
+  tree BT_FN_VOID_PTR_UINT8_PTRMODE
+    = build_function_type_list (void_type_node, ptr_type_node,
+                               unsigned_char_type_node,
+                               pointer_sized_int_node, NULL_TREE);
+
   tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
   tree BT_FN_IX_CONST_VPTR_INT[5];
   tree BT_FN_IX_VPTR_IX_INT[5];
@@ -2951,6 +3019,8 @@ initialize_sanitizer_builtins (void)
 #define BT_FN_I16_CONST_VPTR_INT BT_FN_IX_CONST_VPTR_INT[4]
 #define BT_FN_I16_VPTR_I16_INT BT_FN_IX_VPTR_IX_INT[4]
 #define BT_FN_VOID_VPTR_I16_INT BT_FN_VOID_VPTR_IX_INT[4]
+#undef ATTR_NOTHROW_LIST
+#define ATTR_NOTHROW_LIST ECF_NOTHROW
 #undef ATTR_NOTHROW_LEAF_LIST
 #define ATTR_NOTHROW_LEAF_LIST ECF_NOTHROW | ECF_LEAF
 #undef ATTR_TMPURE_NOTHROW_LEAF_LIST
@@ -3702,4 +3772,330 @@ make_pass_asan_O0 (gcc::context *ctxt)
   return new pass_asan_O0 (ctxt);
 }
 
+/* For stack tagging:
+
+   Return the offset from the frame base tag that the "next" expanded object
+   should have.  */
+uint8_t
+hwasan_current_frame_tag ()
+{
+  return hwasan_frame_tag_offset;
+}
+
+/* For stack tagging:
+
+   Return the 'base pointer' for this function.  If that base pointer has not
+   yet been created then we create a register to hold it and initialise that
+   value with a possibly random tag and the value of the
+   virtual_stack_vars_rtx.  */
+rtx
+hwasan_frame_base ()
+{
+  if (! hwasan_frame_base_ptr)
+    {
+      hwasan_frame_base_ptr
+       = targetm.memtag.insert_random_tag (virtual_stack_vars_rtx);
+    }
+
+  return hwasan_frame_base_ptr;
+}
+
+/* Record a compile-time constant size stack variable that HWASAN will need to
+   tag.  This record of the range of a stack variable will be used by
+   `hwasan_emit_prologue` to emit the RTL at the start of each frame which will
+   set tags in the shadow memory according to the assigned tag for each object.
+
+   The range that the object spans in stack space should be described by the
+   bounds `untagged_base + nearest` and `untagged_base + farthest`.
+   `tagged_base` is the base address which contains the "base frame tag" for
+   this frame, and from which the value to address this object with will be
+   calculated.
+
+   We record the `untagged_base` since the functions in the hwasan library we
+   use to tag memory take pointers without a tag.  */
+void
+hwasan_record_stack_var (rtx untagged_base, rtx tagged_base,
+                        poly_int64 nearest, poly_int64 farthest)
+{
+  hwasan_stack_var *cur_var = new hwasan_stack_var;
+  cur_var->untagged_base = untagged_base;
+  cur_var->tagged_base = tagged_base;
+  cur_var->nearest_offset = nearest;
+  cur_var->farthest_offset = farthest;
+  cur_var->tag_offset = hwasan_current_frame_tag ();
+
+  hwasan_tagged_stack_vars.safe_push (cur_var);
+}
+
+/* Return the RTX representing the farthest extent of the statically allocated
+   stack objects for this frame.  If hwasan_frame_base_ptr has not been
+   initialised then we are not storing any static variables on the stack in
+   this frame.  In this case we return NULL_RTX to represent that.
+
+   Otherwise simply return virtual_stack_vars_rtx + frame_offset.  */
+rtx
+hwasan_get_frame_extent ()
+{
+  return hwasan_frame_base_ptr ?
+    plus_constant (Pmode, virtual_stack_vars_rtx, frame_offset)
+    : NULL_RTX;
+}
+
+
+/* For stack tagging:
+
+   Increment the tag offset modulo the size a tag can represent.  */
+void
+hwasan_increment_frame_tag ()
+{
+  uint8_t tag_bits = HWASAN_TAG_SIZE;
+  gcc_assert (HWASAN_TAG_SIZE
+             <= sizeof (hwasan_frame_tag_offset) * CHAR_BIT);
+  hwasan_frame_tag_offset = (hwasan_frame_tag_offset + 1) % (1 << tag_bits);
+  /* The "background tag" of the stack is zero by definition.
+     This is the tag that objects like parameters passed on the stack and
+     spilled registers are given.  It is handy to avoid this tag for objects
+     whose tags we decide ourselves, partly to ensure that buffer overruns
+     can't affect these important variables (e.g. saved link register, saved
+     stack pointer etc) and partly to make debugging easier (everything with a
+     tag of zero is space allocated automatically by the compiler).
+
+     This is not feasible when using random frame tags (the default
+     configuration for hwasan) since the tag for the given frame is randomly
+     chosen at runtime.  In order to avoid any tags matching the stack
+     background we would need to decide tag offsets at runtime instead of
+     compile time (and pay the resulting performance cost).
+
+     When not using random base tags for each frame (i.e. when compiled with
+     `--param hwasan-random-frame-tag=0`) the base tag for each frame is zero.
+     This means the tag that each object gets is equal to the
+     hwasan_frame_tag_offset used in determining it.
+     When this is the case we *can* ensure no object gets the tag of zero by
+     simply ensuring no object has the hwasan_frame_tag_offset of zero.
+
+     There is the extra complication that we only record the
+     hwasan_frame_tag_offset here (which is the offset from the tag stored in
+     the stack pointer).  In the kernel, the tag in the stack pointer is 0xff
+     rather than zero.  This does not cause problems since tags of 0xff are
+     never checked in the kernel.  As mentioned at the beginning of this
+     comment the background tag of the stack is zero by definition, which means
+     that for the kernel we should skip offsets of both 0 and 1 from the stack
+     pointer.  Avoiding the offset of 0 ensures we use a tag which will be
+     checked, avoiding the offset of 1 ensures we use a tag that is not the
+     same as the background.  */
+  if (hwasan_frame_tag_offset == 0 && ! param_hwasan_random_frame_tag)
+    hwasan_frame_tag_offset += 1;
+  if (hwasan_frame_tag_offset == 1 && ! param_hwasan_random_frame_tag
+      && sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS))
+    hwasan_frame_tag_offset += 1;
+}
+
+/* Clear internal state for the next function.
+   This function is called before variables on the stack get expanded, in
+   `init_vars_expansion`.  */
+void
+hwasan_record_frame_init ()
+{
+  delete asan_used_labels;
+  asan_used_labels = NULL;
+
+  /* If this isn't the case then some stack variable was recorded *before*
+     hwasan_record_frame_init is called, yet *after* the hwasan prologue for
+     the previous frame was emitted.  Such stack variables would not have
+     their shadow stack filled in.  */
+  gcc_assert (hwasan_tagged_stack_vars.is_empty ());
+  hwasan_frame_base_ptr = NULL_RTX;
+
+  /* When not using a random frame tag we can avoid the background stack
+     color which gives the user a little better debug output upon a crash.
+     Meanwhile, when using a random frame tag it will be nice to avoid adding
+     tags for the first object since that is unnecessary extra work.
+     Hence set the initial hwasan_frame_tag_offset to be 0 if using a random
+     frame tag and 1 otherwise.
+
+     As described in hwasan_increment_frame_tag, in the kernel the stack
+     pointer has the tag 0xff.  That means that to avoid 0xff and 0 (the tag
+     which the kernel does not check and the background tag respectively) we
+     start with a tag offset of 2.  */
+  hwasan_frame_tag_offset = param_hwasan_random_frame_tag
+    ? 0
+    : sanitize_flags_p (SANITIZE_KERNEL_HWADDRESS) ? 2 : 1;
+}
+
+/* For stack tagging:
+   (Emits HWASAN equivalent of what is emitted by
+   `asan_emit_stack_protection`).
+
+   Emits the extra prologue code to set the shadow stack as required for HWASAN
+   stack instrumentation.
+
+   The objects to tag are read from the vector hwasan_tagged_stack_vars, which
+   is filled by hwasan_record_stack_var.  Each entry provides the following.
+
+   `tagged_base` is the tagged base register for the object.  We record a base
+   for each object since large aligned objects have a different base to
+   others and we need to know which objects use which base.
+
+   `untagged_base` contains the same address as above except without a tag.
+   This is needed since libhwasan only accepts untagged pointers in
+   __hwasan_tag_memory.
+
+   `nearest_offset` and `farthest_offset` are the offsets from the base that
+   bound the object on the stack in this frame.
+
+   `tag_offset` is the tag *offset* the object should have from the tag in its
+   base pointer.  The vector is emptied once the tags are emitted.  */
+void
+hwasan_emit_prologue ()
+{
+  /* We need untagged base pointers since libhwasan only accepts untagged
+    pointers in __hwasan_tag_memory.  We need the tagged base pointer to obtain
+    the base tag for an offset.  */
+
+  gcc_assert ((hwasan_frame_base_ptr == NULL_RTX)
+             == hwasan_tagged_stack_vars.is_empty ());
+  if (! hwasan_frame_base_ptr)
+    return;
+
+  size_t length = hwasan_tagged_stack_vars.length ();
+  hwasan_stack_var **vars = hwasan_tagged_stack_vars.address ();
+
+  poly_int64 bot = 0, top = 0;
+  size_t i = 0;
+  for (i = 0; i < length; i++)
+    {
+      hwasan_stack_var *cur = vars[i];
+      poly_int64 nearest = cur->nearest_offset;
+      poly_int64 farthest = cur->farthest_offset;
+
+      if (known_ge (nearest, farthest))
+       {
+         top = nearest;
+         bot = farthest;
+       }
+      else
+       {
+         /* Given how these values are calculated, one must be known greater
+            than the other.  */
+         gcc_assert (known_le (nearest, farthest));
+         top = farthest;
+         bot = nearest;
+       }
+      poly_int64 size = (top - bot);
+
+      /* Assert the edge of each variable is aligned to the HWASAN tag granule
+        size.  */
+      gcc_assert (multiple_p (top, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (bot, HWASAN_TAG_GRANULE_SIZE));
+      gcc_assert (multiple_p (size, HWASAN_TAG_GRANULE_SIZE));
+
+      rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+      rtx base_tag = targetm.memtag.extract_tag (cur->tagged_base, NULL_RTX);
+      rtx tag = plus_constant (QImode, base_tag, cur->tag_offset);
+      tag = hwasan_truncate_to_tag_size (tag, NULL_RTX);
+
+      rtx bottom = convert_memory_address (ptr_mode,
+                                          plus_constant (Pmode,
+                                                         cur->untagged_base,
+                                                         bot));
+      emit_library_call (ret, LCT_NORMAL, VOIDmode,
+                        bottom, ptr_mode,
+                        tag, QImode,
+                        gen_int_mode (size, ptr_mode), ptr_mode);
+      delete cur;
+    }
+  /* Clear the stack vars, we've emitted the prologue for them all now.  */
+  hwasan_tagged_stack_vars.truncate (0);
+}
+
+/* For stack tagging:
+
+   Return RTL insns to clear the tags between DYNAMIC and VARS pointers
+   into the stack.  These instructions should be emitted at the end of
+   every function.
+
+   If `dynamic` is NULL_RTX then no insns are returned.  */
+rtx_insn *
+hwasan_emit_untag_frame (rtx dynamic, rtx vars)
+{
+  if (! dynamic)
+    return NULL;
+
+  start_sequence ();
+
+  dynamic = convert_memory_address (ptr_mode, dynamic);
+  vars = convert_memory_address (ptr_mode, vars);
+
+  rtx top_rtx;
+  rtx bot_rtx;
+  if (FRAME_GROWS_DOWNWARD)
+    {
+      top_rtx = vars;
+      bot_rtx = dynamic;
+    }
+  else
+    {
+      top_rtx = dynamic;
+      bot_rtx = vars;
+    }
+
+  rtx size_rtx = expand_simple_binop (ptr_mode, MINUS, top_rtx, bot_rtx,
+                                     NULL_RTX, /* unsignedp = */0,
+                                     OPTAB_DIRECT);
+
+  rtx ret = init_one_libfunc ("__hwasan_tag_memory");
+  emit_library_call (ret, LCT_NORMAL, VOIDmode,
+                    bot_rtx, ptr_mode,
+                    HWASAN_STACK_BACKGROUND, QImode,
+                    size_rtx, ptr_mode);
+
+  do_pending_stack_adjust ();
+  rtx_insn *insns = get_insns ();
+  end_sequence ();
+  return insns;
+}
+
+/* Needs to be GTY(()), because cgraph_build_static_cdtor may
+   invoke ggc_collect.  */
+static GTY(()) tree hwasan_ctor_statements;
+
+/* Insert module initialization into this TU.  This initialization calls the
+   initialization code for libhwasan.  */
+void
+hwasan_finish_file (void)
+{
+  /* Do not emit constructor initialization for the kernel.
+     (the kernel has its own initialization already).  */
+  if (flag_sanitize & SANITIZE_KERNEL_HWADDRESS)
+    return;
+
+  /* Avoid instrumenting code in the hwasan constructors/destructors.  */
+  flag_sanitize &= ~SANITIZE_HWADDRESS;
+  int priority = MAX_RESERVED_INIT_PRIORITY - 1;
+  tree fn = builtin_decl_implicit (BUILT_IN_HWASAN_INIT);
+  append_to_statement_list (build_call_expr (fn, 0), &hwasan_ctor_statements);
+  cgraph_build_static_cdtor ('I', hwasan_ctor_statements, priority);
+  flag_sanitize |= SANITIZE_HWADDRESS;
+}
+
+/* For stack tagging:
+
+   Truncate `tag` to the number of bits that a tag uses (i.e. to
+   HWASAN_TAG_SIZE).  Store the result in `target` if it's convenient.  */
+rtx
+hwasan_truncate_to_tag_size (rtx tag, rtx target)
+{
+  gcc_assert (GET_MODE (tag) == QImode);
+  if (HWASAN_TAG_SIZE != GET_MODE_PRECISION (QImode))
+    {
+      gcc_assert (GET_MODE_PRECISION (QImode) > HWASAN_TAG_SIZE);
+      tag = expand_simple_binop (QImode, AND, tag,
+                                gen_int_mode ((1 << HWASAN_TAG_SIZE) - 1,
+                                              QImode),
+                                target, /* unsignedp = */1, OPTAB_WIDEN);
+      gcc_assert (tag);
+    }
+  return tag;
+}
+
 #include "gt-asan.h"
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index c46b1bc5cbd1fba03b033b8d44ba186570780c3f..535366d104b9cec02ed2d07682476eebdbbe9161 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -637,6 +637,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_SIZE_SIZE_PTR, BT_VOID, BT_SIZE, BT_SIZE,
 DEF_FUNCTION_TYPE_3 (BT_FN_UINT_UINT_PTR_PTR, BT_UINT, BT_UINT, BT_PTR, BT_PTR)
 DEF_FUNCTION_TYPE_3 (BT_FN_PTR_PTR_CONST_SIZE_BOOL,
                     BT_PTR, BT_PTR, BT_CONST_SIZE, BT_BOOL)
+DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_UINT8_PTRMODE, BT_VOID, BT_PTR, BT_UINT8,
+                    BT_PTRMODE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
                     BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index ee67ac15d5cf98797144b9d08a75f4cf7ee5ad33..92121fb898bbf7a90aa0e43c65ff3fe7b20d7c99 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -250,6 +250,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,    \
               true, true, true, ATTRS, true, \
              (flag_sanitize & (SANITIZE_ADDRESS | SANITIZE_THREAD \
+                               | SANITIZE_HWADDRESS \
                                | SANITIZE_UNDEFINED \
                                | SANITIZE_UNDEFINED_NONDEFAULT) \
               || flag_sanitize_coverage))
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b270a4ddb9db469ba52e42f36a1bc2f02d8f03fc..52a0dd6dfda1868622b88821a6d4abc5f2c7ce03 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -365,19 +365,22 @@ align_local_variable (tree decl, bool really_expand)
 {
   unsigned int align;
 
-  if (TREE_CODE (decl) == SSA_NAME)
-    align = TYPE_ALIGN (TREE_TYPE (decl));
-  else
-    {
-      align = LOCAL_DECL_ALIGNMENT (decl);
-      /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
-        That is done before IPA and could bump alignment based on host
-        backend even for offloaded code which wants different
-        LOCAL_DECL_ALIGNMENT.  */
-      if (really_expand)
-       SET_DECL_ALIGN (decl, align);
-    }
-  return align / BITS_PER_UNIT;
+  align = (TREE_CODE (decl) == SSA_NAME)
+    ? TYPE_ALIGN (TREE_TYPE (decl)) : LOCAL_DECL_ALIGNMENT (decl);
+
+  if (hwasan_sanitize_stack_p ())
+    align = MAX (align, ((unsigned)HWASAN_TAG_GRANULE_SIZE) * BITS_PER_UNIT);
+
+  if (TREE_CODE (decl) != SSA_NAME && really_expand)
+    /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
+       That is done before IPA and could bump alignment based on host
+       backend even for offloaded code which wants different
+       LOCAL_DECL_ALIGNMENT.  */
+    SET_DECL_ALIGN (decl, align);
+
+  unsigned int ret_align = align / BITS_PER_UNIT;
+
+  return ret_align;
 }
 
 /* Align given offset BASE with ALIGN.  Truncate up if ALIGN_UP is true,
@@ -420,6 +423,14 @@ alloc_stack_frame_space (poly_int64 size, unsigned HOST_WIDE_INT align)
   return offset;
 }
 
+/* Ensure that the stack is aligned to ALIGN bytes.
+   Return the new frame offset.  */
+static poly_int64
+align_frame_offset (unsigned HOST_WIDE_INT align)
+{
+  return alloc_stack_frame_space (0, align);
+}
+
 /* Accumulate DECL into STACK_VARS.  */
 
 static void
@@ -996,7 +1007,12 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
   /* If this fails, we've overflowed the stack frame.  Error nicely?  */
   gcc_assert (known_eq (offset, trunc_int_for_mode (offset, Pmode)));
 
-  x = plus_constant (Pmode, base, offset);
+  if (hwasan_sanitize_stack_p ())
+    x = targetm.memtag.add_tag (base, offset,
+                               hwasan_current_frame_tag ());
+  else
+    x = plus_constant (Pmode, base, offset);
+
   x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME
                   ? TYPE_MODE (TREE_TYPE (decl))
                   : DECL_MODE (SSAVAR (decl)), x);
@@ -1006,7 +1022,8 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align,
       /* Set alignment we actually gave this decl if it isn't an SSA name.
          If it is we generate stack slots only accidentally so it isn't as
         important, we'll simply use the alignment that is already set.  */
-      if (base == virtual_stack_vars_rtx)
+      if (base == virtual_stack_vars_rtx
+         || (hwasan_sanitize_stack_p () && base == hwasan_frame_base ()))
        offset -= frame_phase;
       align = known_alignment (offset);
       align *= BITS_PER_UNIT;
@@ -1045,13 +1062,13 @@ public:
 /* A subroutine of expand_used_vars.  Give each partition representative
    a unique location within the stack frame.  Update each partition member
    with that location.  */
-
 static void
 expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
 {
   size_t si, i, j, n = stack_vars_num;
   poly_uint64 large_size = 0, large_alloc = 0;
   rtx large_base = NULL;
+  rtx large_untagged_base = NULL;
   unsigned large_align = 0;
   bool large_allocation_done = false;
   tree decl;
@@ -1102,7 +1119,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
     {
       rtx base;
       unsigned base_align, alignb;
-      poly_int64 offset;
+      poly_int64 offset = 0;
 
       i = stack_vars_sorted[si];
 
@@ -1123,10 +1140,31 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
       if (pred && !pred (i))
        continue;
 
+      base = hwasan_sanitize_stack_p ()
+       ? hwasan_frame_base ()
+       : virtual_stack_vars_rtx;
       alignb = stack_vars[i].alignb;
       if (alignb * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT)
        {
-         base = virtual_stack_vars_rtx;
+         poly_int64 hwasan_orig_offset;
+         if (hwasan_sanitize_stack_p ())
+           {
+             /* There must be no tag granule "shared" between different
+                objects.  This means that no HWASAN_TAG_GRANULE_SIZE byte
+                chunk can have more than one object in it.
+
+                We ensure this by forcing the end of the last bit of data to
+                be aligned to HWASAN_TAG_GRANULE_SIZE bytes here, and setting
+                the start of each variable to be aligned to
+                HWASAN_TAG_GRANULE_SIZE bytes in `align_local_variable`.
+
+                We can't align just one of the start or end, since there are
+                untagged things stored on the stack whose alignment we do not
+                control, and these can't share a tag granule with a tagged
+                variable.  */
+             hwasan_orig_offset = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
+             gcc_assert (stack_vars[i].alignb >= HWASAN_TAG_GRANULE_SIZE);
+           }
          /* ASAN description strings don't yet have a syntax for expressing
             polynomial offsets.  */
          HOST_WIDE_INT prev_offset;
@@ -1137,7 +1175,7 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
            {
              if (data->asan_vec.is_empty ())
                {
-                 alloc_stack_frame_space (0, ASAN_RED_ZONE_SIZE);
+                 align_frame_offset (ASAN_RED_ZONE_SIZE);
                  prev_offset = frame_offset.to_constant ();
                }
              prev_offset = align_base (prev_offset,
@@ -1205,6 +1243,14 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
            {
              offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
              base_align = crtl->max_used_stack_slot_alignment;
+
+             if (hwasan_sanitize_stack_p ())
+               /* Use `frame_offset` here rather than offset since the
+                  frame_offset describes the extent allocated for this
+                  particular variable while `offset` describes the address
+                  that this variable starts at.  */
+               hwasan_record_stack_var (virtual_stack_vars_rtx, base,
+                                        hwasan_orig_offset, frame_offset);
            }
        }
       else
@@ -1225,14 +1271,33 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
              loffset = alloc_stack_frame_space
                (rtx_to_poly_int64 (large_allocsize),
                 PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT);
-             large_base = get_dynamic_stack_base (loffset, large_align);
+             large_base = get_dynamic_stack_base (loffset, large_align, base);
              large_allocation_done = true;
            }
-         gcc_assert (large_base != NULL);
 
+         gcc_assert (large_base != NULL);
          large_alloc = aligned_upper_bound (large_alloc, alignb);
          offset = large_alloc;
          large_alloc += stack_vars[i].size;
+         if (hwasan_sanitize_stack_p ())
+           {
+             /* An object with a large alignment requirement means that the
+                alignment requirement is greater than the required alignment
+                for tags.  */
+             if (!large_untagged_base)
+               large_untagged_base
+                 = targetm.memtag.untagged_pointer (large_base, NULL_RTX);
+             /* Ensure the end of the variable is also aligned correctly.  */
+             poly_int64 align_again
+               = aligned_upper_bound (large_alloc, HWASAN_TAG_GRANULE_SIZE);
+             /* For large allocations we always allocate a chunk of space
+                (which is addressed by large_untagged_base/large_base) and
+                then use positive offsets from that.  Hence the farthest
+                offset is `align_again` and the nearest offset from the base
+                is `offset`.  */
+             hwasan_record_stack_var (large_untagged_base, large_base,
+                                      offset, align_again);
+           }
 
          base = large_base;
          base_align = large_align;
@@ -1243,9 +1308,10 @@ expand_stack_vars (bool (*pred) (size_t), class stack_vars_data *data)
       for (j = i; j != EOC; j = stack_vars[j].next)
        {
          expand_one_stack_var_at (stack_vars[j].decl,
-                                  base, base_align,
-                                  offset);
+                                  base, base_align, offset);
        }
+      if (hwasan_sanitize_stack_p ())
+       hwasan_increment_frame_tag ();
     }
 
   gcc_assert (known_eq (large_alloc, large_size));
@@ -1321,27 +1387,49 @@ expand_one_stack_var_1 (tree var)
 {
   poly_uint64 size;
   poly_int64 offset;
+  poly_int64 hwasan_orig_offset;
   unsigned byte_align;
 
   if (TREE_CODE (var) == SSA_NAME)
     {
       tree type = TREE_TYPE (var);
       size = tree_to_poly_uint64 (TYPE_SIZE_UNIT (type));
-      byte_align = TYPE_ALIGN_UNIT (type);
     }
   else
-    {
-      size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
-      byte_align = align_local_variable (var, true);
-    }
+    size = tree_to_poly_uint64 (DECL_SIZE_UNIT (var));
+  byte_align = align_local_variable (var, true);
 
   /* We handle highly aligned variables in expand_stack_vars.  */
   gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT);
 
+  if (hwasan_sanitize_stack_p ())
+    /* Allocate zero bytes to align the stack.  */
+    hwasan_orig_offset = align_frame_offset (HWASAN_TAG_GRANULE_SIZE);
   offset = alloc_stack_frame_space (size, byte_align);
 
-  expand_one_stack_var_at (var, virtual_stack_vars_rtx,
+  rtx base;
+  if (hwasan_sanitize_stack_p ())
+    {
+      base = hwasan_frame_base ();
+      /* Use `frame_offset` to automatically account for machines where the
+        frame grows upwards.
+
+        `offset` will always point to the "start" of the stack object, which
+        will be the smallest address.  For !FRAME_GROWS_DOWNWARD this is *not*
+        the "furthest" offset from the base delimiting the current stack
+        object.  `frame_offset` will always delimit the extent of the frame.
+        */
+      hwasan_record_stack_var (virtual_stack_vars_rtx, base,
+                              hwasan_orig_offset, frame_offset);
+    }
+  else
+    base = virtual_stack_vars_rtx;
+
+  expand_one_stack_var_at (var, base,
                           crtl->max_used_stack_slot_alignment, offset);
+
+  if (hwasan_sanitize_stack_p ())
+    hwasan_increment_frame_tag ();
 }
 
 /* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are
@@ -1947,6 +2035,8 @@ init_vars_expansion (void)
   /* Initialize local stack smashing state.  */
   has_protected_decls = false;
   has_short_buffer = false;
+  if (hwasan_sanitize_stack_p ())
+    hwasan_record_frame_init ();
 }
 
 /* Free up stack variable graph data.  */
@@ -2271,10 +2361,26 @@ expand_used_vars (void)
       expand_stack_vars (NULL, &data);
     }
 
+  if (hwasan_sanitize_stack_p ())
+    hwasan_emit_prologue ();
   if (asan_sanitize_allocas_p () && cfun->calls_alloca)
     var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
                                              virtual_stack_vars_rtx,
                                              var_end_seq);
+  else if (hwasan_sanitize_allocas_p () && cfun->calls_alloca)
+    /* When using out-of-line instrumentation we only want to emit one function
+       call for clearing the tags in a region of shadow stack.  When there are
+       alloca calls in this frame we want to emit a call using the
+       virtual_stack_dynamic_rtx, but otherwise we use the hwasan_frame_extent
+       rtx we created in expand_stack_vars.  */
+    var_end_seq = hwasan_emit_untag_frame (virtual_stack_dynamic_rtx,
+                                          virtual_stack_vars_rtx);
+  else if (hwasan_sanitize_stack_p ())
+    /* If no variables were stored on the stack, `hwasan_get_frame_extent`
+       will return NULL_RTX and hence `hwasan_emit_untag_frame` will return
+       NULL (i.e. an empty sequence).  */
+    var_end_seq = hwasan_emit_untag_frame (hwasan_get_frame_extent (),
+                                          virtual_stack_vars_rtx);
 
   fini_vars_expansion ();
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 998d940e144c31aa7309b478b92b680736f317bf..dd47400ae4100daa8d6987f1d347857d8717bfcd 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2976,8 +2976,63 @@ A target hook which lets a backend compute the set of pressure classes to  be us
 @end deftypefn
 
 @deftypefn {Target Hook} bool TARGET_MEMTAG_CAN_TAG_ADDRESSES ()
-True if backend architecture naturally supports ignoring the top byte of
- pointers.  This feature means that @option{-fsanitize=hwaddress} can work.
+True if backend architecture naturally supports ignoring some region of
+pointers.  This feature means that @option{-fsanitize=hwaddress} can work.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_TAG_SIZE ()
+Return the size of a tag (in bits) for this platform.
+
+The default returns 8.
+@end deftypefn
+
+@deftypefn {Target Hook} uint8_t TARGET_MEMTAG_GRANULE_SIZE ()
+Return the size in real memory that each byte in shadow memory refers to.
+I.e. if a variable is X bytes long in memory, then this hook should return
+the value Y such that the tag in shadow memory spans X/Y bytes.
+
+Most variables will need to be aligned to this amount since two variables
+that are neighbours in memory and share a tag granule would need to share
+the same tag.
+
+The default returns 16.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_INSERT_RANDOM_TAG (rtx @var{base})
+Create a new register with the address value of @var{base} and a
+(possibly) random tag in it.
+This function is used to generate a tagged base for the current stack frame.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_ADD_TAG (rtx @var{base}, poly_int64 @var{addr_offset}, uint8_t @var{tag_offset})
+Return an RTX that represents the result of adding @var{addr_offset} to
+the address pointer @var{base} and @var{tag_offset} to the tag in pointer
+@var{base}.
+The resulting RTX must either be a valid memory address or be able to get
+put into an operand with force_operand.
+
+Unlike other memtag hooks, this must return an expression and not emit any
+RTL.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_SET_TAG (rtx @var{untagged_base}, rtx @var{tag}, rtx @var{target})
+Return an RTX representing @var{untagged_base} but with the tag @var{tag}.
+Try and store this in @var{target} if convenient.
+@var{untagged_base} is required to have a zero tag when this hook is called.
+The default of this hook is to set the top byte of @var{untagged_base} to
+@var{tag}.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_EXTRACT_TAG (rtx @var{tagged_pointer}, rtx @var{target})
+Return an RTX representing the tag stored in @var{tagged_pointer}.
+Store the result in @var{target} if it is convenient.
+The default represents the top byte of the original pointer.
+@end deftypefn
+
+@deftypefn {Target Hook} rtx TARGET_MEMTAG_UNTAGGED_POINTER (rtx @var{tagged_pointer}, rtx @var{target})
+Return an RTX representing @var{tagged_pointer} with its tag set to zero.
+Store the result in @var{target} if convenient.
+The default clears the top byte of the original pointer.
 @end deftypefn
 
 @node Stack and Calling
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 80960c1fe041bd08ccc22a4c41ebf740eca80015..98e3c2b65581f62b7ebdcefa51723a125507e895 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2377,6 +2377,20 @@ in the reload pass.
 
 @hook TARGET_MEMTAG_CAN_TAG_ADDRESSES
 
+@hook TARGET_MEMTAG_TAG_SIZE
+
+@hook TARGET_MEMTAG_GRANULE_SIZE
+
+@hook TARGET_MEMTAG_INSERT_RANDOM_TAG
+
+@hook TARGET_MEMTAG_ADD_TAG
+
+@hook TARGET_MEMTAG_SET_TAG
+
+@hook TARGET_MEMTAG_EXTRACT_TAG
+
+@hook TARGET_MEMTAG_UNTAGGED_POINTER
+
 @node Stack and Calling
 @section Stack Layout and Calling Conventions
 @cindex calling conventions
diff --git a/gcc/explow.h b/gcc/explow.h
index 0df8c62b82a8bf1d8d6baf0b6fb658e66361a407..581831cb19fdf9e8fd969bb30139e1358279a34d 100644
--- a/gcc/explow.h
+++ b/gcc/explow.h
@@ -106,7 +106,7 @@ extern rtx allocate_dynamic_stack_space (rtx, unsigned, unsigned,
 extern void get_dynamic_stack_size (rtx *, unsigned, unsigned, HOST_WIDE_INT *);
 
 /* Returns the address of the dynamic stack space without allocating it.  */
-extern rtx get_dynamic_stack_base (poly_int64, unsigned);
+extern rtx get_dynamic_stack_base (poly_int64, unsigned, rtx);
 
 /* Return an rtx doing runtime alignment to REQUIRED_ALIGN on TARGET.  */
 extern rtx align_dynamic_address (rtx, unsigned);
diff --git a/gcc/explow.c b/gcc/explow.c
index 0fbc6d25b816457a3d13ed45d16b5dd0513cfacd..41c3f6ace49c0e55c080e10b917842b1b21d49eb 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1583,10 +1583,14 @@ allocate_dynamic_stack_space (rtx size, unsigned size_align,
    OFFSET is the offset of the area into the virtual stack vars area.
 
    REQUIRED_ALIGN is the alignment (in bits) required for the region
-   of memory.  */
+   of memory.
+
+   BASE is the rtx of the base of this virtual stack vars area.
+   The only time this is not `virtual_stack_vars_rtx` is when tagging pointers
+   on the stack.  */
 
 rtx
-get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
+get_dynamic_stack_base (poly_int64 offset, unsigned required_align, rtx base)
 {
   rtx target;
 
@@ -1594,7 +1598,7 @@ get_dynamic_stack_base (poly_int64 offset, unsigned required_align)
     crtl->preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
 
   target = gen_reg_rtx (Pmode);
-  emit_move_insn (target, virtual_stack_vars_rtx);
+  emit_move_insn (target, base);
   target = expand_binop (Pmode, add_optab, target,
                         gen_int_mode (offset, Pmode),
                         NULL_RTX, 1, OPTAB_LIB_WIDEN);
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index a32715ddb92e69b7ca7be28a8f17a369b891bd76..4f854fb994229fd4ed91d3b5cff7c7acff9a55bc 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -180,6 +180,12 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_COMPARE, "__sanitizer_ptr_cmp",
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_POINTER_SUBTRACT, "__sanitizer_ptr_sub",
                      BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
 
+/* Hardware Address Sanitizer.  */
+DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_INIT, "__hwasan_init",
+                     BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_HWASAN_TAG_MEM, "__hwasan_tag_memory",
+                     BT_FN_VOID_PTR_UINT8_PTRMODE, ATTR_NOTHROW_LIST)
+
 /* Thread Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init", 
                      BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
diff --git a/gcc/target.def b/gcc/target.def
index 07064ea366a2c0bde0afbf45d78e16d7e9e9d13d..26183a42aef5f30aae7abcd474843200dd5206eb 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6835,10 +6835,73 @@ HOOK_VECTOR (TARGET_MEMTAG_, memtag)
 
 DEFHOOK
 (can_tag_addresses,
- "True if backend architecture naturally supports ignoring the top byte of\n\
- pointers.  This feature means that @option{-fsanitize=hwaddress} can work.",
+ "True if backend architecture naturally supports ignoring some region of\n\
+pointers.  This feature means that @option{-fsanitize=hwaddress} can work.",
  bool, (), default_memtag_can_tag_addresses)
 
+DEFHOOK
+(tag_size,
+ "Return the size of a tag (in bits) for this platform.\n\
+\n\
+The default returns 8.",
+  uint8_t, (), default_memtag_tag_size)
+
+DEFHOOK
+(granule_size,
+ "Return the size in real memory that each byte in shadow memory refers to.\n\
+I.e. if a variable is X bytes long in memory, then this hook should return\n\
+the value Y such that the tag in shadow memory spans X/Y bytes.\n\
+\n\
+Most variables will need to be aligned to this amount since two variables\n\
+that are neighbours in memory and share a tag granule would need to share\n\
+the same tag.\n\
+\n\
+The default returns 16.",
+  uint8_t, (), default_memtag_granule_size)
+
+DEFHOOK
+(insert_random_tag,
+ "Create a new register with the address value of @var{base} and a\n\
+(possibly) random tag in it.\n\
+This function is used to generate a tagged base for the current stack frame.",
+  rtx, (rtx base), default_memtag_insert_random_tag)
+
+DEFHOOK
+(add_tag,
+ "Return an RTX that represents the result of adding @var{addr_offset} to\n\
+the address pointer @var{base} and @var{tag_offset} to the tag in pointer\n\
+@var{base}.\n\
+The resulting RTX must either be a valid memory address or be able to get\n\
+put into an operand with force_operand.\n\
+\n\
+Unlike other memtag hooks, this must return an expression and not emit any\n\
+RTL.",
+  rtx, (rtx base, poly_int64 addr_offset, uint8_t tag_offset),
+  default_memtag_add_tag)
+
+DEFHOOK
+(set_tag,
+ "Return an RTX representing @var{untagged_base} but with the tag @var{tag}.\n\
+Try and store this in @var{target} if convenient.\n\
+@var{untagged_base} is required to have a zero tag when this hook is called.\n\
+The default of this hook is to set the top byte of @var{untagged_base} to\n\
+@var{tag}.",
+  rtx, (rtx untagged_base, rtx tag, rtx target), default_memtag_set_tag)
+
+DEFHOOK
+(extract_tag,
+ "Return an RTX representing the tag stored in @var{tagged_pointer}.\n\
+Store the result in @var{target} if it is convenient.\n\
+The default represents the top byte of the original pointer.",
+  rtx, (rtx tagged_pointer, rtx target), default_memtag_extract_tag)
+
+DEFHOOK
+(untagged_pointer,
+ "Return an RTX representing @var{tagged_pointer} with its tag set to zero.\n\
+Store the result in @var{target} if convenient.\n\
+The default clears the top byte of the original pointer.",
+  rtx, (rtx tagged_pointer, rtx target), default_memtag_untagged_pointer)
+
 HOOK_VECTOR_END (memtag)
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index c11b1879f65bb581af11963b40189028fa490623..20e16ce86cef124dc96195717352c981b8ed8700 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -286,4 +286,12 @@ extern bool speculation_safe_value_not_needed (bool);
 extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
 
 extern bool default_memtag_can_tag_addresses ();
+extern uint8_t default_memtag_tag_size ();
+extern uint8_t default_memtag_granule_size ();
+extern rtx default_memtag_insert_random_tag (rtx);
+extern rtx default_memtag_add_tag (rtx, poly_int64, uint8_t);
+extern rtx default_memtag_set_tag (rtx, rtx, rtx);
+extern rtx default_memtag_extract_tag (rtx, rtx);
+extern rtx default_memtag_untagged_pointer (rtx, rtx);
+
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 8c860f91cae7ccb5ca7fb0ff2364c1f67620d661..5e2b188b09fb21e23be752347a3bcefc2dab62aa 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -70,6 +70,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "varasm.h"
 #include "flags.h"
 #include "explow.h"
+#include "expmed.h"
 #include "calls.h"
 #include "expr.h"
 #include "output.h"
@@ -83,6 +84,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "sbitmap.h"
 #include "function-abi.h"
+#include "attribs.h"
+#include "asan.h"
+#include "emit-rtl.h"
 
 bool
 default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -2379,10 +2383,130 @@ default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
   return result;
 }
 
+/* How many bits to shift in order to access the tag bits.
+   The default is to store the tag in the top 8 bits of a 64 bit pointer, hence
+   shifting 56 bits will leave just the tag.
+   We require that the shift is less than 64 for the `const_int_rtx` use here;
+   that is also a requirement for the tag to actually be stored on any system
+   with 64 bit (or smaller) pointers.  */
+#define HWASAN_SHIFT 56
+#define HWASAN_SHIFT_RTX const_int_rtx[MAX_SAVED_CONST_INT + HWASAN_SHIFT]
+
 bool
 default_memtag_can_tag_addresses ()
 {
   return false;
 }
 
+uint8_t
+default_memtag_tag_size ()
+{
+  return 8;
+}
+
+uint8_t
+default_memtag_granule_size ()
+{
+  return 16;
+}
+
+/* The default implementation of TARGET_MEMTAG_INSERT_RANDOM_TAG.  */
+rtx
+default_memtag_insert_random_tag (rtx untagged)
+{
+  gcc_assert (param_hwasan_instrument_stack);
+  if (param_hwasan_random_frame_tag)
+    {
+      rtx base = gen_reg_rtx (Pmode);
+      rtx temp = gen_reg_rtx (QImode);
+      rtx fn = init_one_libfunc ("__hwasan_generate_tag");
+      rtx new_tag = emit_library_call_value (fn, temp, LCT_NORMAL, QImode);
+      rtx ret = targetm.memtag.set_tag (untagged, new_tag, base);
+      if (ret != base)
+       emit_move_insn (base, ret);
+      return base;
+    }
+  else
+    {
+      /* NOTE: The kernel API does not have __hwasan_generate_tag exposed.
+        In the future we may add the option to emit random tags with inline
+        instrumentation instead of function calls.  This would be the same
+        between the kernel and userland.  */
+      return untagged;
+    }
+}
+
+/* The default implementation of TARGET_MEMTAG_ADD_TAG.  */
+rtx
+default_memtag_add_tag (rtx base, poly_int64 offset, uint8_t tag_offset)
+{
+  /* Need to look into what the most efficient code sequence is.
+     This is a code sequence that would be emitted *many* times, so we
+     want it as small as possible.
+
+     There are two places where tag overflow is a question:
+       - Tagging the shadow stack.
+         (both tagging and untagging).
+       - Tagging addressable pointers.
+
+     We need to ensure both behaviors are the same (i.e. that the tag that
+     ends up in a pointer after "overflowing" the tag bits with a tag addition
+     is the same that ends up in the shadow space).
+
+     The aim is that the behavior of tag addition should follow modulo
+     wrapping in both instances.
+
+     The libhwasan code doesn't have any path that increments a pointer's tag,
+     which means it has no opinion on what happens when a tag increment
+     overflows (and hence we can choose our own behavior).
+
+     NOTE:
+       Here we return an expression which represents the base with the
+       provided offsets.
+       This does not have to be a valid operand to anything, since the
+       `force_operand` machinery in the compiler already handles this.  */
+
+  offset += ((uint64_t)tag_offset << HWASAN_SHIFT);
+  return plus_constant (Pmode, base, offset);
+}
+
+/* The default implementation of TARGET_MEMTAG_SET_TAG.  */
+rtx
+default_memtag_set_tag (rtx untagged, rtx tag, rtx target)
+{
+  gcc_assert (GET_MODE (untagged) == Pmode);
+  gcc_assert (GET_MODE (tag) == QImode);
+  tag = expand_simple_binop (Pmode, ASHIFT, tag, HWASAN_SHIFT_RTX, tag,
+                            /* unsignedp = */1, OPTAB_WIDEN);
+  rtx ret = expand_simple_binop (Pmode, IOR, untagged, tag, target,
+                                /* unsignedp = */1, OPTAB_DIRECT);
+  gcc_assert (ret);
+  return ret;
+}
+
+/* The default implementation of TARGET_MEMTAG_EXTRACT_TAG.  */
+rtx
+default_memtag_extract_tag (rtx tagged_pointer, rtx target)
+{
+  rtx tag = expand_simple_binop (Pmode, LSHIFTRT, tagged_pointer,
+                                HWASAN_SHIFT_RTX, target,
+                                /* unsignedp = */0,
+                                OPTAB_DIRECT);
+  rtx ret = gen_lowpart (QImode, tag);
+  gcc_assert (ret);
+  return ret;
+}
+
+/* The default implementation of TARGET_MEMTAG_UNTAGGED_POINTER.  */
+rtx
+default_memtag_untagged_pointer (rtx tagged_pointer, rtx target)
+{
+  rtx tag_mask = gen_int_mode ((HOST_WIDE_INT_1U << HWASAN_SHIFT) - 1, Pmode);
+  rtx untagged_base = expand_simple_binop (Pmode, AND, tagged_pointer,
+                                          tag_mask, target, true,
+                                          OPTAB_DIRECT);
+  gcc_assert (untagged_base);
+  return untagged_base;
+}
+
 #include "gt-targhooks.h"
diff --git a/gcc/toplev.c b/gcc/toplev.c
index ac40389b1195d98521cf94e4b8a9553961999f30..9fdd8cfc4b4375e226b2247ad1dd4e9ee31fc7b0 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -510,6 +510,9 @@ compile_file (void)
       if (flag_sanitize & SANITIZE_THREAD)
        tsan_finish_file ();
 
+      if (flag_sanitize & SANITIZE_HWADDRESS)
+       hwasan_finish_file ();
+
       omp_finish_file ();
 
       hsa_output_brig ();
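
For anyone wanting to sanity-check the arithmetic in the default memtag
hooks above, here is a rough stand-alone model in plain C.  It is only a
sketch and not part of the patch: it works on `uint64_t` values rather
than RTL, assumes the default layout (8-bit tag in the top byte, 16-byte
granule), and all the `model_*` names and the example address are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, mirroring the defaults in the patch: an 8-bit tag in the
   top byte of a 64-bit pointer, and a 16-byte tag granule.  */
#define MODEL_TAG_SHIFT 56
#define MODEL_GRANULE_SIZE 16

/* Like the default set_tag: OR TAG into a pointer whose tag is zero.  */
static uint64_t
model_set_tag (uint64_t untagged, uint8_t tag)
{
  return untagged | ((uint64_t) tag << MODEL_TAG_SHIFT);
}

/* Like the default extract_tag: the top byte of the pointer.  */
static uint8_t
model_extract_tag (uint64_t tagged)
{
  return (uint8_t) (tagged >> MODEL_TAG_SHIFT);
}

/* Like the default untagged_pointer: clear the top byte.  */
static uint64_t
model_untagged_pointer (uint64_t tagged)
{
  return tagged & ((UINT64_C (1) << MODEL_TAG_SHIFT) - 1);
}

/* Like the default add_tag: one 64-bit addition adjusts the address and the
   tag together.  Any carry out of the tag byte is discarded, so the tag
   wraps modulo 256.  This model assumes the address part itself does not
   carry or borrow into the top byte.  */
static uint64_t
model_add_tag (uint64_t base, int64_t addr_offset, uint8_t tag_offset)
{
  return base + (uint64_t) addr_offset
	 + ((uint64_t) tag_offset << MODEL_TAG_SHIFT);
}

/* Round N up to the next multiple of the tag granule size.  */
static uint64_t
model_align_to_granule (uint64_t n)
{
  return (n + MODEL_GRANULE_SIZE - 1) & ~(uint64_t) (MODEL_GRANULE_SIZE - 1);
}

/* Small worked example using a made-up frame address.  */
static void
model_selfcheck (void)
{
  uint64_t frame = model_set_tag (UINT64_C (0x00007ffc12345000), 0xfe);
  uint64_t var = model_add_tag (frame, 32, 3);

  assert (model_extract_tag (frame) == 0xfe);
  /* 0xfe + 3 overflows the tag byte and wraps round to 0x01.  */
  assert (model_extract_tag (var) == 0x01);
  assert (model_untagged_pointer (var) == UINT64_C (0x00007ffc12345020));
  /* A 20-byte object occupies two whole granules.  */
  assert (model_align_to_granule (20) == 32);
}
```

The wrap-around in `model_selfcheck` is the "modulo wrapping" behaviour
described in the comment in default_memtag_add_tag; the same wrapped tag
would need to end up in shadow memory for the checks to agree.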

Attachment: hwasan-stack-variables.patch.gz
Description: application/gzip
