Re: [PATCH v6 7/9] kfence, Documentation: add KFENCE documentation

2020-10-30 Thread Alexander Potapenko
On Fri, Oct 30, 2020 at 3:50 AM Jann Horn wrote:
>
> On Thu, Oct 29, 2020 at 2:17 PM Marco Elver wrote:
> > Add KFENCE documentation in dev-tools/kfence.rst, and add to index.
> [...]
> > +The KFENCE memory pool is of fixed size, and if the pool is exhausted, no
> > +further KFENCE allocations occur. With ``CONFIG_KFENCE_NUM_OBJECTS`` (default
> > +255), the number of available guarded objects can be controlled. Each object
> > +requires 2 pages, one for the object itself and the other one used as a guard
> > +page; object pages are interleaved with guard pages, and every object page is
> > +therefore surrounded by two guard pages.
> > +
> > +The total memory dedicated to the KFENCE memory pool can be computed as::
> > +
> > +( #objects + 1 ) * 2 * PAGE_SIZE
>
> Plus memory overhead from shattered hugepages. With the default object
> count, on x86, we allocate 2MiB of memory pool, but if we have to
> shatter a 2MiB hugepage for that, we may cause the allocation of one
> extra page table, or 4KiB. Of course that's pretty much negligible.
> But on arm64 it's worse, because there we have to disable hugepages in
> the linear map completely. So on a device with 4GiB memory, we might
> end up with something on the order of 4GiB/2MiB * 0x1000 bytes = 8MiB
> of extra L1 page tables that wouldn't have been needed otherwise -
> significantly more than the default memory pool size.

Note that with CONFIG_RODATA_FULL_DEFAULT_ENABLED (which is on by
default now) these hugepages are already disabled (see patch 3/9).

> If the memory overhead is documented, this detail should probably be
> documented, too.

But, yes, documenting that also makes sense.

> > +Using the default config, and assuming a page size of 4 KiB, results in
> > +dedicating 2 MiB to the KFENCE memory pool.
> [...]
> > +For such errors, the address where the corruption as well as the invalidly
>
> nit: "the address where the corruption occurred" or "the address of
> the corruption"
>
> > +written bytes (offset from the address) are shown; in this representation, '.'
> > +denote untouched bytes. In the example above ``0xac`` is the value written to
> > +the invalid address at offset 0, and the remaining '.' denote that no following
> > +bytes have been touched. Note that, real values are only shown for
> > +``CONFIG_DEBUG_KERNEL=y`` builds; to avoid information disclosure for non-debug
> > +builds, '!' is used instead to denote invalidly written bytes.
> [...]
> > +KFENCE objects each reside on a dedicated page, at either the left or right
> > +page boundaries selected at random. The pages to the left and right of the
> > +object page are "guard pages", whose attributes are changed to a protected
> > +state, and cause page faults on any attempted access. Such page faults are then
> > +intercepted by KFENCE, which handles the fault gracefully by reporting an
> > +out-of-bounds access.
>
> ... and marking the page as accessible so that the faulting code can
> continue (wrongly) executing.
>
>
> [...]
> > +Interface
> > +---------
> > +
> > +The following describes the functions which are used by allocators as well page
>
> nit: "as well as"?
>
>
>
> > +handling code to set up and deal with KFENCE allocations.



-- 
Alexander Potapenko
Software Engineer


Re: [PATCH v6 7/9] kfence, Documentation: add KFENCE documentation

2020-10-29 Thread Jann Horn
On Thu, Oct 29, 2020 at 2:17 PM Marco Elver wrote:
> Add KFENCE documentation in dev-tools/kfence.rst, and add to index.
[...]
> +The KFENCE memory pool is of fixed size, and if the pool is exhausted, no
> +further KFENCE allocations occur. With ``CONFIG_KFENCE_NUM_OBJECTS`` (default
> +255), the number of available guarded objects can be controlled. Each object
> +requires 2 pages, one for the object itself and the other one used as a guard
> +page; object pages are interleaved with guard pages, and every object page is
> +therefore surrounded by two guard pages.
> +
> +The total memory dedicated to the KFENCE memory pool can be computed as::
> +
> +( #objects + 1 ) * 2 * PAGE_SIZE

Plus memory overhead from shattered hugepages. With the default object
count, on x86, we allocate 2MiB of memory pool, but if we have to
shatter a 2MiB hugepage for that, we may cause the allocation of one
extra page table, or 4KiB. Of course that's pretty much negligible.
But on arm64 it's worse, because there we have to disable hugepages in
the linear map completely. So on a device with 4GiB memory, we might
end up with something on the order of 4GiB/2MiB * 0x1000 bytes = 8MiB
of extra L1 page tables that wouldn't have been needed otherwise -
significantly more than the default memory pool size.

If the memory overhead is documented, this detail should probably be
documented, too.
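
To make the arithmetic above concrete, here is a rough sketch in Python. It assumes 4 KiB base pages, 2 MiB hugepages, and one extra 4 KiB page table for every hugepage that ends up mapped with base pages. The function names are made up for illustration and are not kernel interfaces.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

PAGE_SIZE = 4 * KIB       # assumption: 4 KiB base pages
HUGEPAGE_SIZE = 2 * MIB   # assumption: 2 MiB hugepages


def kfence_pool_bytes(num_objects, page_size=PAGE_SIZE):
    """Pool size per the documented formula: (#objects + 1) * 2 * PAGE_SIZE."""
    return (num_objects + 1) * 2 * page_size


def extra_page_table_bytes(mapped_bytes, hugepage_size=HUGEPAGE_SIZE,
                           page_size=PAGE_SIZE):
    """Estimate: one extra page-table page per hugepage mapped with base pages."""
    return (mapped_bytes // hugepage_size) * page_size


# Default object count from the patch.
pool = kfence_pool_bytes(255)

# x86: only the hugepage backing the pool is shattered (assuming the pool
# is hugepage-aligned, so it covers exactly one 2 MiB hugepage).
x86_overhead = extra_page_table_bytes(pool)

# arm64: hugepages are disabled for the whole linear map; 4 GiB of memory
# is the example device size used above.
arm64_overhead = extra_page_table_bytes(4 * GIB)

print(pool // MIB, "MiB pool")
print(x86_overhead // KIB, "KiB extra on x86")
print(arm64_overhead // MIB, "MiB extra on arm64")
```

With these assumptions the arm64 page-table overhead comes out at four times the default pool size, which is the point of the comment above.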

> +Using the default config, and assuming a page size of 4 KiB, results in
> +dedicating 2 MiB to the KFENCE memory pool.
[...]
> +For such errors, the address where the corruption as well as the invalidly

nit: "the address where the corruption occurred" or "the address of
the corruption"

> +written bytes (offset from the address) are shown; in this representation, '.'
> +denote untouched bytes. In the example above ``0xac`` is the value written to
> +the invalid address at offset 0, and the remaining '.' denote that no following
> +bytes have been touched. Note that, real values are only shown for
> +``CONFIG_DEBUG_KERNEL=y`` builds; to avoid information disclosure for non-debug
> +builds, '!' is used instead to denote invalidly written bytes.
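
The representation described in the quoted paragraph can be modelled roughly as follows. This is only a sketch: the quoted text does not say how KFENCE decides that a byte is untouched, so the code simply compares against a hypothetical expected fill pattern. The function name, the 0xaa fill value and the space-separated layout are all assumptions.

```python
def render_corrupted_bytes(expected, actual, debug_kernel):
    """Render a byte range in the style the documentation describes.

    '.' marks an untouched byte. A modified byte is shown as its real
    value when debug_kernel is True, and as '!' otherwise.
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same length")
    cells = []
    for want, got in zip(expected, actual):
        if got == want:
            cells.append(".")
        elif debug_kernel:
            cells.append(f"0x{got:02x}")
        else:
            cells.append("!")
    return " ".join(cells)


# Hypothetical 8-byte region; 0xaa is a made-up fill value.
expected = bytes([0xAA] * 8)
# A single invalid write of 0xac at offset 0, as in the documented example.
actual = bytes([0xAC]) + expected[1:]

debug_view = render_corrupted_bytes(expected, actual, debug_kernel=True)
prod_view = render_corrupted_bytes(expected, actual, debug_kernel=False)
print(debug_view)
print(prod_view)
```

The second call shows why non-debug builds leak less: the position of the bad write is still visible, but the value written is not.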
[...]
> +KFENCE objects each reside on a dedicated page, at either the left or right
> +page boundaries selected at random. The pages to the left and right of the
> +object page are "guard pages", whose attributes are changed to a protected
> +state, and cause page faults on any attempted access. Such page faults are then
> +intercepted by KFENCE, which handles the fault gracefully by reporting an
> +out-of-bounds access.

... and marking the page as accessible so that the faulting code can
continue (wrongly) executing.
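
As a rough model of the placement described in the quoted text, the sketch below puts an object at the left or right boundary of its page and checks which out-of-bounds offsets leave the object page and would therefore hit a guard page. It assumes 4 KiB pages and ignores any alignment requirements; the helper names are hypothetical, and the page address is the one from the example report in the patch.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def place_object(page_start, size, at_right):
    """Return the object address on its dedicated page.

    at_right=False puts the object at the left page boundary,
    at_right=True makes the object end exactly at the right boundary.
    """
    if not 0 < size <= PAGE_SIZE:
        raise ValueError("object must fit in a single page")
    if at_right:
        return page_start + PAGE_SIZE - size
    return page_start


def leaves_object_page(page_start, addr):
    """True if addr is outside the object page, i.e. lands on a guard page
    when addr is within one page of the object page."""
    return not (page_start <= addr < page_start + PAGE_SIZE)


page = 0xB672F000  # object page taken from the example report
left_obj = place_object(page, 32, at_right=False)
right_obj = place_object(page, 32, at_right=True)

# One byte before a left-aligned object hits the left guard page ...
left_underflow_faults = leaves_object_page(page, left_obj - 1)
# ... but one byte past its end is still on the object page.
left_overflow_faults = leaves_object_page(page, left_obj + 32)
# For a right-aligned object it is the other way around.
right_overflow_faults = leaves_object_page(page, right_obj + 32)
right_underflow_faults = leaves_object_page(page, right_obj - 1)
```

In this model a caller would pick `at_right` with something like `random.choice([False, True])`, which is why a given allocation can only catch one of the two directions through a page fault.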


[...]
> +Interface
> +---------
> +
> +The following describes the functions which are used by allocators as well page

nit: "as well as"?



> +handling code to set up and deal with KFENCE allocations.


[PATCH v6 7/9] kfence, Documentation: add KFENCE documentation

2020-10-29 Thread Marco Elver
Add KFENCE documentation in dev-tools/kfence.rst, and add to index.

Reviewed-by: Dmitry Vyukov 
Co-developed-by: Alexander Potapenko 
Signed-off-by: Alexander Potapenko 
Signed-off-by: Marco Elver 
---
v3:
* Re-introduce reference to Documentation/dev-tools/kfence.rst.

v2:
* Many clarifications based on comments from Andrey Konovalov.
* Document CONFIG_KFENCE_SAMPLE_INTERVAL=0 usage.
* Make use-cases between KASAN and KFENCE clearer.
* Be clearer about the fact the pool is fixed size.
* Update based on reporting changes.
* Explicitly mention max supported allocation size is PAGE_SIZE.
---
 Documentation/dev-tools/index.rst  |   1 +
 Documentation/dev-tools/kfence.rst | 291 +
 lib/Kconfig.kfence |   2 +
 3 files changed, 294 insertions(+)
 create mode 100644 Documentation/dev-tools/kfence.rst

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..1b1cf4f5c9d9 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -22,6 +22,7 @@ whole; patches welcome!
    ubsan
    kmemleak
    kcsan
+   kfence
    gdb-kernel-debugging
    kgdb
    kselftest
diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst
new file mode 100644
index ..f0ee8db1bf87
--- /dev/null
+++ b/Documentation/dev-tools/kfence.rst
@@ -0,0 +1,291 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel Electric-Fence (KFENCE)
+==============================
+
+Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety
+error detector. KFENCE detects heap out-of-bounds access, use-after-free, and
+invalid-free errors.
+
+KFENCE is designed to be enabled in production kernels, and has near zero
+performance overhead. Compared to KASAN, KFENCE trades performance for
+precision. The main motivation behind KFENCE's design is that with enough
+total uptime KFENCE will detect bugs in code paths not typically exercised by
+non-production test workloads. One way to quickly achieve a large enough total
+uptime is to deploy the tool across a large fleet of machines.
+
+Usage
+-----
+
+To enable KFENCE, configure the kernel with::
+
+    CONFIG_KFENCE=y
+
+To build a kernel with KFENCE support, but disabled by default (to enable, set
+``kfence.sample_interval`` to a non-zero value), configure the kernel with::
+
+    CONFIG_KFENCE=y
+    CONFIG_KFENCE_SAMPLE_INTERVAL=0
+
+KFENCE provides several other configuration options to customize behaviour (see
+the respective help text in ``lib/Kconfig.kfence`` for more info).
+
+Tuning performance
+~~~~~~~~~~~~~~~~~~
+
+The most important parameter is KFENCE's sample interval, which can be set via
+the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The
+sample interval determines the frequency with which heap allocations will be
+guarded by KFENCE. The default is configurable via the Kconfig option
+``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0``
+disables KFENCE.
+
+The KFENCE memory pool is of fixed size, and if the pool is exhausted, no
+further KFENCE allocations occur. With ``CONFIG_KFENCE_NUM_OBJECTS`` (default
+255), the number of available guarded objects can be controlled. Each object
+requires 2 pages, one for the object itself and the other one used as a guard
+page; object pages are interleaved with guard pages, and every object page is
+therefore surrounded by two guard pages.
+
+The total memory dedicated to the KFENCE memory pool can be computed as::
+
+    ( #objects + 1 ) * 2 * PAGE_SIZE
+
+Using the default config, and assuming a page size of 4 KiB, results in
+dedicating 2 MiB to the KFENCE memory pool.
+
+Error reports
+~~~~~~~~~~~~~
+
+A typical out-of-bounds access looks like this::
+
+    ==================================================================
+    BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b
+
+    Out-of-bounds access at 0xb672efff (1B left of kfence-#17):
+     test_out_of_bounds_read+0xa3/0x22b
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#17 [0xb672f000-0xb672f01f, size=32, cache=kmalloc-32] allocated by task 507:
+     test_alloc+0xf3/0x25b
+     test_out_of_bounds_read+0x98/0x22b
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+The header of the report provides a short summary of the function involved in
+the access. It is followed by more detailed information about the access and
+its origin. Note that real kernel addresses are only shown for
+``CONFIG_DEBUG_KERNEL=y``