Re: Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-20 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> @@ -7877,6 +7903,26 @@ gimple_build_vector (gimple_seq *seq, location_t loc,
>return builder->build ();
>  }
>  
> +/* Emit gimple statements into  that take a value given in `old_size`
> +   and generate a value guaranteed to be rounded upwards to `align`.
> +
> +   Return the tree node representing this size, it is of TREE_TYPE `type`.  
> */

Nit, but: the usual way of referring to parameter names is to use caps
(OLD_SIZE, ALIGN, TYPE) rather than backticks.  I don't think it's
necessary to change the hwasan-specific code to follow that style,
since what you have is self-consistent and readable as-is (although
changing it would be fine too if you prefer).  But since the surrounding
code consistently follows the caps style, I think it would be better to
use it here too.

OK with that change, thanks.

> +
> +tree
> +gimple_build_round_up (gimple_seq *seq, location_t loc, tree type,
> +tree old_size, unsigned HOST_WIDE_INT align)
> +{
> +  unsigned HOST_WIDE_INT tg_mask = align - 1;
> +  /* tree new_size = (old_size + tg_mask) & ~tg_mask;  */
> +  gcc_assert (INTEGRAL_TYPE_P (type));
> +  tree tree_mask = build_int_cst (type, tg_mask);
> +  tree oversize = gimple_build (seq, loc, PLUS_EXPR, type, old_size,
> + tree_mask);
> +
> +  tree mask = build_int_cst (type, -align);
> +  return gimple_build (seq, loc, BIT_AND_EXPR, type, oversize, mask);
> +}
> +
>  /* Return true if the result of assignment STMT is known to be non-negative.
> If the return value is based on the assumption that signed overflow is
> undefined, set *STRICT_OVERFLOW_P to true; otherwise, don't change

Richard


Re: Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-20 Thread Matthew Malcomson via Gcc-patches

Updates after latest review.
(testing underway)

---
There are four main features to this change:

1) Check pointer tags match address tags.

When sanitizing for hwasan we now put HWASAN_CHECK internal functions before
memory accesses in the `asan` pass.  This checks that a tag in the pointer
being used match the tag stored in shadow memory for the memory region being
used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way, usually in the runtime).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(asan_instrument): Branch based on whether using HWASAN or ASAN.
(pass_asan::gate): Return true if sanitizing HWASAN.
(pass_asan_O0::gate): Return true if sanitizing HWASAN.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
* asan.h (hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-fold.c (gimple_build): New overload for building function
calls without arguments.
(gimple_build_round_up): New.
* gimple-fold.h (gimple_build): New decl.
(gimple_build): New inline function.
(gimple_build_round_up): New decl.
(gimple_build_round_up): New inline function.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_SET_TAG): New.
* internal-fn.def (HWASAN_ALLOCA_UNPOISON): New.
(HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_SET_TAG): New.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.

Re: Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-19 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> +/* Emit gimple statements into  that take the size given in `len` and
> +   generate a size that is guaranteed to be rounded upwards to `align`.
> +
> +   This is a helper function for both handle_builtin_alloca and
> +   asan_expand_mark_ifn when using HWASAN.
> +
> +   Return the tree node representing this size, it is of TREE_TYPE
> +   size_type_node.  */
> +
> +static tree
> +hwasan_emit_round_up (gimple_seq *seq, location_t loc, tree old_size,
> +   uint8_t align)
> +{
> +  uint8_t tg_mask = align - 1;
> +  /* tree new_size = (old_size + tg_mask) & ~tg_mask;  */
> +  tree tree_mask = build_int_cst (size_type_node, tg_mask);
> +  tree oversize = gimple_build (seq, loc, PLUS_EXPR, size_type_node, 
> old_size,
> + tree_mask);
> +
> +  tree mask = build_int_cst (size_type_node, -align);
> +  return gimple_build (seq, loc, BIT_AND_EXPR, size_type_node, oversize, 
> mask);
> +}
> +

There's nothing really hwasan-specific about this, apart from the choice
“uint8_t” for the alignment and mask.  So I think we should:

- chnage “align” and “tg_mask” to “unsigned HOST_WIDE_INT”
- change the name to “gimple_build_round_up”
- take the type as a parameter, in the same position as other
  gimple_build_* type parameters
- move the function to gimple-fold.c, exported via gimple-fold.h
- drop:

   This is a helper function for both handle_builtin_alloca and
   asan_expand_mark_ifn when using HWASAN.

> […]
> @@ -690,6 +757,71 @@ handle_builtin_alloca (gcall *call, gimple_stmt_iterator 
> *iter)
>  = DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA
>? 0 : tree_to_uhwi (gimple_call_arg (call, 1));
>  
> +  if (hwasan_sanitize_allocas_p ())
> +{
> +  gimple_seq stmts = NULL;
> +  location_t loc = gimple_location (gsi_stmt (*iter));
> +  /*
> +  HWASAN needs a different expansion.
> +
> +  addr = __builtin_alloca (size, align);
> +
> +  should be replaced by
> +
> +  new_size = size rounded up to HWASAN_TAG_GRANULE_SIZE byte alignment;
> +  untagged_addr = __builtin_alloca (new_size, align);
> +  tag = __hwasan_choose_alloca_tag ();
> +  addr = ifn_HWASAN_SET_TAG (untagged_addr, tag);
> +  __hwasan_tag_memory (untagged_addr, tag, new_size);
> + */
> +  /* Ensure alignment at least HWASAN_TAG_GRANULE_SIZE bytes so we start 
> on
> +  a tag granule.  */
> +  align = align > HWASAN_TAG_GRANULE_SIZE ? align : 
> HWASAN_TAG_GRANULE_SIZE;
> +
> +  tree old_size = gimple_call_arg (call, 0);
> +  tree new_size = hwasan_emit_round_up (, loc, old_size,
> + HWASAN_TAG_GRANULE_SIZE);
> +
> +  /* Make the alloca call */
> +  tree untagged_addr
> + = gimple_build (, loc,
> + as_combined_fn (BUILT_IN_ALLOCA_WITH_ALIGN), ptr_type,
> + new_size, build_int_cst (size_type_node, align));
> +
> +  /* Choose the tag.
> +  Here we use an internal function so we can choose the tag at expand
> +  time.  We need the decision to be made after stack variables have been
> +  assigned their tag (i.e. once the hwasan_frame_tag_offset variable has
> +  been set to one after the last stack variables tag).  */
> +  gcall *stmt = gimple_build_call_internal (IFN_HWASAN_CHOOSE_TAG, 0);
> +  tree tag = make_ssa_name (unsigned_char_type_node);
> +  gimple_call_set_lhs (stmt, tag);
> +  gimple_set_location (stmt, loc);
> +  gimple_seq_add_stmt_without_update (, stmt);

Even though there are currently no folds defined for argumentless
functions, I think it would be worth adding a gimple_build overload
for this instead of writing it out by hand.  I.e. have a zero-argument
version of:

tree
gimple_build (gimple_seq *seq, location_t loc, combined_fn fn,
  tree type, tree arg0)
{
  tree res = gimple_simplify (fn, type, arg0, seq, gimple_build_valueize);
  if (!res)
{
  gcall *stmt;
  if (internal_fn_p (fn))
stmt = gimple_build_call_internal (as_internal_fn (fn), 1, arg0);
  else
{
  tree decl = builtin_decl_implicit (as_builtin_fn (fn));
  stmt = gimple_build_call (decl, 1, arg0);
}
  if (!VOID_TYPE_P (type))
{
  res = create_tmp_reg_or_ssa_name (type);
  gimple_call_set_lhs (stmt, res);
}
  gimple_set_location (stmt, loc);
  gimple_seq_add_stmt_without_update (seq, stmt);
}
  return res;
}

without the gimple_simplify call.

> +
> +  /* Add tag to pointer.  */
> +  tree addr
> + = gimple_build (, loc, as_combined_fn (IFN_HWASAN_SET_TAG),

This is CFN_HWASAN_SET_TAG.

> + ptr_type, untagged_addr, tag);
> +
> +  /* Tag shadow memory.
> +  NOTE: require using `untagged_addr` here for libhwasan API.  */
> +  gimple_build (, loc, as_combined_fn (BUILT_IN_HWASAN_TAG_MEM),
> + void_type_node, untagged_addr, tag, 

Update [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-11-19 Thread Matthew Malcomson via Gcc-patches

Update to match the change in initialisation for patch 5/X.

MM

---

There are four main features to this change:

1) Check pointer tags match address tags.

When sanitizing for hwasan we now put HWASAN_CHECK internal functions before
memory accesses in the `asan` pass.  This checks that a tag in the pointer
being used match the tag stored in shadow memory for the memory region being
used.

These internal functions are expanded into actual checks in the sanopt
pass that happens just before expansion into RTL.

We use the same mechanism that currently inserts ASAN_CHECK internal
functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

Handle all builtin functions that we know use memory accesses.
This commit uses the machinery added for ASAN to identify builtin
functions that access memory.

The main differences between the approaches for HWASAN and ASAN are:
 - libhwasan intercepts much less builtin functions.
 - Alloca needs to be transformed differently (instead of adding
   redzones it needs to tag shadow memory and return a tagged pointer).
 - stack_restore needs to untag the shadow stack between the current
   position and where it's going.
 - `noreturn` functions can not be handled by simply unpoisoning the
   entire shadow stack -- there is no "always valid" tag.
   (exceptions and things such as longjmp need to be handled in a
   different way, usually in the runtime).

For hardware implemented checking (such as AArch64's memory tagging
extension) alloca and stack_restore will need to be handled by hooks in
the backend rather than transformation at the gimple level.  This will
allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning

Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison
variables on entry/exit of a block.

In order to simply use the exact same machinery we're using the same
internal functions until the SANOPT pass.  This means that all handling
of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a
positive that handling of the internal function doesn't have to be
duplicated for a function that behaves exactly the same but has a
different name.

gcc/ChangeLog:

* asan.c (asan_instrument_reads): New.
(asan_instrument_writes): New.
(asan_memintrin): New.
(handle_builtin_stack_restore): Account for HWASAN.
(hwasan_emit_round_up): New.
(handle_builtin_alloca): Account for HWASAN.
(get_mem_refs_of_builtin_call): Special case strlen for HWASAN.
(hwasan_instrument_reads): New.
(hwasan_instrument_writes): New.
(hwasan_memintrin): New.
(report_error_func): Assert not HWASAN.
(build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK.
(instrument_derefs): HWASAN does not tag globals.
(instrument_builtin_call): Use new helper functions.
(maybe_instrument_call): Don't instrument `noreturn` functions.
(initialize_sanitizer_builtins): Add new type.
(asan_expand_mark_ifn): Account for HWASAN.
(asan_expand_check_ifn): Assert never called by HWASAN.
(asan_expand_poison_ifn): Account for HWASAN.
(asan_instrument): Branch based on whether using HWASAN or ASAN.
(pass_asan::gate): Return true if sanitizing HWASAN.
(pass_asan_O0::gate): Return true if sanitizing HWASAN.
(hwasan_check_func): New.
(hwasan_expand_check_ifn): New.
(hwasan_expand_mark_ifn): New.
(gate_hwasan): New.
* asan.h (hwasan_expand_check_ifn): New decl.
(hwasan_expand_mark_ifn): New decl.
(gate_hwasan): New decl.
(asan_intercepted_p): Always false for hwasan.
(asan_sanitize_use_after_scope): Account for HWASAN.
* builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New.
* gimple-pretty-print.c (dump_gimple_call_args): Account for
HWASAN.
* gimplify.c (asan_poison_variable): Account for HWASAN.
(gimplify_function_tree): Remove requirement of
SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in
`asan_sanitize_use_after_scope`.
* internal-fn.c (expand_HWASAN_CHECK): New.
(expand_HWASAN_ALLOCA_UNPOISON): New.
(expand_HWASAN_CHOOSE_TAG): New.
(expand_HWASAN_MARK): New.
(expand_HWASAN_SET_TAG): New.
* internal-fn.def (HWASAN_ALLOCA_UNPOISON): New.
(HWASAN_CHOOSE_TAG): New.
(HWASAN_CHECK): New.
(HWASAN_MARK): New.
(HWASAN_SET_TAG): New.
* sanitizer.def (BUILT_IN_HWASAN_LOAD1): New.
(BUILT_IN_HWASAN_LOAD2): New.
(BUILT_IN_HWASAN_LOAD4): New.
(BUILT_IN_HWASAN_LOAD8): New.
(BUILT_IN_HWASAN_LOAD16): New.
(BUILT_IN_HWASAN_LOADN): New.
(BUILT_IN_HWASAN_STORE1): New.
(BUILT_IN_HWASAN_STORE2): New.
(BUILT_IN_HWASAN_STORE4): New.