Re: [PATCH v7 1/9] mm: add Kernel Electric-Fence infrastructure

2020-11-03 Thread Jann Horn
On Tue, Nov 3, 2020 at 6:58 PM Marco Elver  wrote:
> This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
> low-overhead sampling-based memory safety error detector of heap
> use-after-free, invalid-free, and out-of-bounds access errors.

Reviewed-by: Jann Horn 


[PATCH v7 1/9] mm: add Kernel Electric-Fence infrastructure

2020-11-03 Thread Marco Elver
From: Alexander Potapenko 

This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
low-overhead sampling-based memory safety error detector of heap
use-after-free, invalid-free, and out-of-bounds access errors.

KFENCE is designed to be enabled in production kernels, and has near
zero performance overhead. Compared to KASAN, KFENCE trades performance
for precision. The main motivation behind KFENCE's design is that with
enough total uptime KFENCE will detect bugs in code paths not typically
exercised by non-production test workloads. One way to quickly achieve a
large enough total uptime is to deploy the tool across a large fleet of
machines.

KFENCE objects each reside on a dedicated page, at either the left or
right page boundaries. The pages to the left and right of the object
page are "guard pages", whose attributes are changed to a protected
state, and cause page faults on any attempted access to them. Such page
faults are then intercepted by KFENCE, which handles the fault
gracefully by reporting a memory access error. To detect out-of-bounds
writes to memory within the object's page itself, KFENCE also uses
pattern-based redzones. The following figure illustrates the page
layout:

  ---+-----------+-----------+-----------+-----------+-----------+---
     | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
     | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
     | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
     | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
     | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
     | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
  ---+-----------+-----------+-----------+-----------+-----------+---
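
When an access hits a guard page, the fault handler only needs a cheap
check to decide whether the faulting address belongs to KFENCE at all;
this is essentially the job of is_kfence_address() mentioned in the
changelog below. A minimal sketch of such an in-range check (the pool
symbol and pool-size macro here are assumptions derived from the layout
above, not the patch's exact code):

#include <linux/compiler.h>
#include <linux/mm.h>

/* Assumed: base of the contiguous, memblock-allocated KFENCE pool. */
extern char *__kfence_pool;

/*
 * Assumed: one object page plus one guard page per object, plus one
 * extra leading guard-page pair, as in the figure above.
 */
#define KFENCE_POOL_SIZE	((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 * PAGE_SIZE)

static __always_inline bool in_kfence_pool(const void *addr)
{
	/*
	 * A single subtract-and-compare covers the whole pool; the
	 * non-NULL check guards against KFENCE never having been
	 * initialized.
	 */
	return unlikely((unsigned long)((char *)addr - __kfence_pool) <
				KFENCE_POOL_SIZE && addr);
}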

Guarded allocations are set up based on a sample interval (can be set
via kfence.sample_interval). After expiration of the sample interval,
the next allocation through the main allocator (SLAB or SLUB) returns a
guarded allocation from the KFENCE object pool. At this point, the timer
is reset, and the next allocation is set up after the expiration of the
interval.
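
For illustration, the sampling described above can be pictured as a
self-re-arming delayed work that opens an allocation gate once per
interval. This is only a sketch under assumed names (the gate, timer,
and interval variables are not necessarily those used in the patch):

#include <linux/atomic.h>
#include <linux/jiffies.h>
#include <linux/workqueue.h>

/* Assumed names throughout this sketch. */
static unsigned long kfence_sample_interval = 100;	/* ms */
static atomic_t kfence_allocation_gate = ATOMIC_INIT(1);

static void toggle_allocation_gate(struct work_struct *work);
static DECLARE_DELAYED_WORK(kfence_timer, toggle_allocation_gate);

static void toggle_allocation_gate(struct work_struct *work)
{
	/* Allow one guarded allocation, then wait for the next tick. */
	atomic_set(&kfence_allocation_gate, 0);
	schedule_delayed_work(&kfence_timer,
			      msecs_to_jiffies(kfence_sample_interval));
}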

To enable/disable a KFENCE allocation through the main allocator's
fast-path without overhead, KFENCE relies on static branches via the
static keys infrastructure. The static branch is toggled to redirect the
allocation to KFENCE. To date, we have verified by running synthetic
benchmarks (sysbench I/O, hackbench) that a kernel compiled with KFENCE
is performance-neutral compared to the non-KFENCE baseline.
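
A minimal sketch of such a fast-path hook follows; kfence_alloc() is
the hook name used by this series, but the static key name and the
__kfence_alloc() slow-path helper shown here are assumptions:

#include <linux/jump_label.h>
#include <linux/slab.h>

DECLARE_STATIC_KEY_FALSE(kfence_allocation_key);	/* assumed name */

/* Assumed slow-path helper that hands out an object from the KFENCE pool. */
void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);

static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size,
					  gfp_t flags)
{
	/*
	 * With the key disabled this is a patched-out branch, so the
	 * allocator fast path pays no cost; toggling the key redirects
	 * the next allocation into KFENCE.
	 */
	if (static_branch_unlikely(&kfence_allocation_key))
		return __kfence_alloc(s, size, flags);
	return NULL;	/* caller falls back to the regular SLAB/SLUB path */
}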

For more details, see Documentation/dev-tools/kfence.rst (added later in
the series).

Reviewed-by: Dmitry Vyukov 
Reviewed-by: SeongJae Park 
Co-developed-by: Marco Elver 
Signed-off-by: Marco Elver 
Signed-off-by: Alexander Potapenko 
---
v7:
* Reports by Jann Horn:
  * Comment about is_kfence_address() performance.
  * Cleaner CONFIG_KFENCE_STRESS_TEST_FAULTS, using "if EXPERT".
  * Fix comment in metadata_to_pageaddr().
  * Fix comment above get_stack_skipnr().
  * Update comment for for_each_canary().
  * Clean up print_diff_canary() boundary calculation.
  * Remove SLUB/SLAB specific code and move to later patches.
* Make __kfence_free() part of the public API.

v6:
* Record allocation and free task pids, and show them in reports. This
  information helps more easily identify e.g. racy use-after-frees.

v5:
* MAJOR CHANGE: Removal of HAVE_ARCH_KFENCE_STATIC_POOL and static pool
  support in favor of memblock_alloc'd pool only, as it avoids all issues
  with virt_to translations. With the new optimizations to
  is_kfence_address(), we measure no noticeable performance impact.
* Verify we do not end up with a compound head page.
* Fix reporting of corruptions to never show object contents.
* Reformat kfence_alloc [suggested by Mark Rutland].
* Taint with TAINT_BAD_PAGE, to distinguish memory errors from regular
  warnings (also used by SL*B/KASAN/etc. for memory errors).
* Show OOB offset bytes in report.
* Rework kfence_shutdown_cache().
* Set page fields to fix obj_to_index+objs_per_slab_page.
* Suggestions/Reports by Jann Horn:
  * Move generic page allocation code to core.c.
  * Use KERN_ERR for dump_stack_print_info.
  * Make __kfence_pool pointer __ro_after_init.
  * Fix typos.
  * Add likely hint for check_canary_byte.
  * Make for_each_canary __always_inline.
  * Add comment about IPIs for static key toggling.
  * Check for non-null pointer in is_kfence_address(), in case KFENCE is
    never initialized.
  * Rework sample_interval parameter dynamic setting semantics.
  * Fix redzone checking.
  * Optimize is_kfence_address() by using better in-range check.

v4:
* Make static memory pool's attrs entirely arch-dependent.
* Revert MAINTAINERS, and make separate patch.
* Fix report generation if __slab_free tail-called.

v3:
* Reports by SeongJae Park:
  * Remove reference to Documentation/dev-tools/kfence.rst.
  * Remove redundant braces.
  * Use CONFIG_KFENCE_NUM_OBJECTS instead of ARRAY_SIZE(...).
  * Align some