================
@@ -0,0 +1,387 @@
+=========================================
+Lightweight Fault Isolation (LFI) in LLVM
+=========================================
+
+.. contents::
+   :local:
+
+Introduction
+++++++++++++
+
+Lightweight Fault Isolation (LFI) is a compiler-based sandboxing technology for
+native code. Like WebAssembly and Native Client, LFI isolates sandboxed code in-process
+(i.e., in the same address space as a host application).
+
+LFI is designed from the ground up to sandbox existing code, such as C/C++
+libraries (including assembly code) and device drivers.
+
+LFI aims for the following goals:
+
+* Compatibility: LFI can be used to sandbox nearly all existing C/C++/assembly
+  libraries unmodified (they just need to be recompiled). Sandboxed libraries
+  work with existing system call interfaces, and are compatible with existing
+  development tools such as profilers, debuggers, and sanitizers.
+* Performance: LFI aims for minimal overhead vs. unsandboxed code.
+* Security: The LFI runtime and compiler elements aim to be simple and
+  verifiable when possible.
+* Usability: LFI aims to make it easy as possible to used retrofit sandboxing,
+  i.e., to migrate from unsandboxed to sandboxed libraries with minimal effort.
+
+When building a program for the LFI target the compiler is designed to ensure
+that the program will only be able to access memory within a limited region of
+the virtual address space, starting from where the program is loaded (the
+current design sets this region to a size of 4GiB of virtual memory). Programs
+built for the LFI target are restricted to using a subset of the instruction
+set, designed so that the programs can be soundly confined to their sandbox
+region. LFI programs must run inside of an "emulator" (usually called the LFI
+runtime), responsible for initializing the sandbox region, loading the program,
+and servicing system call requests, or other forms of runtime calls.
+
+LFI uses an architecture-specific sandboxing scheme based on the general
+technique of Software-Based Fault Isolation (SFI). Initial support for LFI in
+LLVM is focused on the AArch64 platform, with x86-64 support planned for the
+future. The initial version of LFI for AArch64 is designed to support the
+Armv8.1 AArch64 architecture.
+
+See `https://github.com/lfi-project <https://github.com/lfi-project/>`__ for
+details about the LFI project and additional software needed to run LFI
+programs.
+
+Compiler Requirements
++++++++++++++++++++++
+
+When building for the ``aarch64_lfi`` target, the compiler must restrict use of
+the instruction set to a subset of instructions, which are known to be safe
+from a sandboxing perspective. To do this, we apply a set of simple rewrites at
+the assembly language level to transform standard native AArch64 assembly into
+LFI-compatible AArch64 assembly.
+
+These rewrites (also called "expansions") are applied at the very end of the
+LLVM compilation pipeline (during the assembler step). This allows the rewrites
+to be applied to hand-written assembly, including inline assembly.
+
+Compiler Options
+================
+
+The LFI target has several configuration options.
+
+* ``+lfi-stores``: create a "stores-only" sandbox, where rewrites are not applied to loads.
+* ``+lfi-jumps``: create a "jumps-only" sandbox, where rewrites are not applied to loads/stores.
+
+Reserved Registers
+==================
+
+The LFI target uses a custom ABI that reserves additional registers for the
+platform. The registers are listed below, along with the security invariant
+that must be maintained.
+
+* ``x27``: always holds the sandbox base address.
+* ``x28``: always holds an address within the sandbox.
+* ``sp``: always holds an address within the sandbox.
+* ``x30``: always holds an address within the sandbox.
+* ``x26``: scratch register.
+* ``x25``: points to a thread-local virtual register file for storing runtime context information.
+
+Linker Support
+==============
+
+In the initial version, LFI only supports static linking, and only supports
+creating ``static-pie`` binaries. There is nothing that fundamentally precludes
+support for dynamic linking on the LFI target, but such support would require
+that the code generated by the linker for PLT entries be slightly modified in
+order to conform to the LFI architecture subset.
+
+Assembly Rewrites
+=================
+
+Terminology
+~~~~~~~~~~~
+
+In the following assembly rewrites, some shorthand is used.
+
+* ``xN`` or ``wN``: refers to any general-purpose non-reserved register.
+* ``{a,b,c}``: matches any of ``a``, ``b``, or ``c``.
+* ``LDSTr``: a load/store instruction that supports register-register addressing modes, with one source/destination register.
+* ``LDSTx``: a load/store instruction not matched by ``LDSTr``.
+
+Control flow
+~~~~~~~~~~~~
+
+Indirect branches get rewritten to branch through register ``x28``, which must
+always contain an address within the sandbox. An ``add`` is used to safely load
+``x28`` with the destination address. Since ``ret`` uses ``x30`` by default,
+which already must contain an address within the sandbox, it does not require
+any rewrite.
+
++--------------------+---------------------------+
+|      Original      |         Rewritten         |
++--------------------+---------------------------+
+| .. code-block::    | .. code-block::           |
+|                    |                           |
+|    {br,blr,ret} xN |    add x28, x27, wN, uxtw |
+|                    |    {br,blr,ret} x28       |
+|                    |                           |
++--------------------+---------------------------+
+| .. code-block::    | .. code-block::           |
+|                    |                           |
+|    ret             |    ret                    |
+|                    |                           |
++--------------------+---------------------------+
+
+Memory accesses
+~~~~~~~~~~~~~~~
+
+Memory accesses are rewritten to use the ``[x27, wM, uxtw]`` addressing mode if
+it is available, which is automatically safe. Otherwise, rewrites fall back to
+using ``x28`` along with an instruction to safely load it with the target
+address.
+
++---------------------------------+-------------------------------+
+|            Original             |           Rewritten           |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTr xN, [xM]               |    LDSTr xN, [x27, wM, uxtw]  |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTr xN, [xM, #I]           |    add x28, x27, wM, uxtw     |
+|                                 |    LDSTr xN, [x28, #I]        |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTr xN, [xM, #I]!          |    add xM, xM, #I             |
+|                                 |    LDSTr xN, [x27, wM, uxtw]  |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTr xN, [xM], #I           |    LDSTr xN, [x27, wM, uxtw]  |
+|                                 |    add xM, xM, #I             |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTr xN, [xM1, xM2]         |    add x26, xM1, xM2          |
+|                                 |    LDSTr xN, [x27, w26, uxtw] |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTr xN, [xM1, xM2, MOD #I] |    add x26, xM1, xM2, MOD #I  |
+|                                 |    LDSTr xN, [x27, w26, uxtw] |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTx ..., [xM]              |    add x28, x27, wM, uxtw     |
+|                                 |    LDSTx ..., [x28]           |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTx ..., [xM, #I]          |    add x28, x27, wM, uxtw     |
+|                                 |    LDSTx ..., [x28, #I]       |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTx ..., [xM, #I]!         |    add x28, x27, wM, uxtw     |
+|                                 |    LDSTx ..., [x28, #I]       |
+|                                 |    add xM, xM, #I             |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTx ..., [xM], #I          |    add x28, x27, wM, uxtw     |
+|                                 |    LDSTx ..., [x28]           |
+|                                 |    add xM, xM, #I             |
+|                                 |                               |
++---------------------------------+-------------------------------+
+| .. code-block::                 | .. code-block::               |
+|                                 |                               |
+|    LDSTx ..., [xM1], xM2        |    add x28, x27, wM1, uxtw    |
+|                                 |    LDSTx ..., [x28]           |
+|                                 |    add xM1, xM1, xM2          |
+|                                 |                               |
++---------------------------------+-------------------------------+
+
+Stack pointer modification
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When the stack pointer is modified, we write the modified value to a temporary,
+before loading it back into ``sp`` with a safe ``add``.
+
++------------------------------+-------------------------------+
+|           Original           |           Rewritten           |
++------------------------------+-------------------------------+
+| .. code-block::              | .. code-block::               |
+|                              |                               |
+|    mov sp, xN                |    add sp, x27, wN, uxtw      |
+|                              |                               |
++------------------------------+-------------------------------+
+| .. code-block::              | .. code-block::               |
+|                              |                               |
+|    {add,sub} sp, sp, {#I,xN} |    {add,sub} x26, sp, {#I,xN} |
+|                              |    add sp, x27, w26, uxtw     |
+|                              |                               |
++------------------------------+-------------------------------+
+
+Link register modification
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When the link register is modified, we write the modified value to a
+temporary, before loading it back into ``x30`` with a safe ``add``.
+
++-----------------------+----------------------------+
+|       Original        |         Rewritten          |
++-----------------------+----------------------------+
+| .. code-block::       | .. code-block::            |
+|                       |                            |
+|    ldr x30, [...]     |    ldr x26, [...]          |
+|                       |    add x30, x27, w26, uxtw |
+|                       |                            |
++-----------------------+----------------------------+
+| .. code-block::       | .. code-block::            |
+|                       |                            |
+|    ldp xN, x30, [...] |    ldp xN, x26, [...]      |
+|                       |    add x30, x27, w26, uxtw |
+|                       |                            |
++-----------------------+----------------------------+
+| .. code-block::       | .. code-block::            |
+|                       |                            |
+|    ldp x30, xN, [...] |    ldp x26, xN, [...]      |
+|                       |    add x30, x27, w26, uxtw |
+|                       |                            |
++-----------------------+----------------------------+
+
+System instructions
+~~~~~~~~~~~~~~~~~~~
+
+System calls are rewritten into a sequence that loads the address of the first
+runtime call entrypoint and jumps to it. The runtime call entrypoint table is
+stored at the start of the sandbox, so it can be referenced by ``x27``. The
+rewrite also saves and restores the link register, since it is used for
+branching into the runtime.
+
++-----------------+----------------------------+
+|    Original     |         Rewritten          |
++-----------------+----------------------------+
+| .. code-block:: | .. code-block::            |
+|                 |                            |
+|    svc #0       |    mov w26, w30            |
+|                 |    ldr x30, [x27]          |
+|                 |    blr x30                 |
+|                 |    add x30, x27, w26, uxtw |
+|                 |                            |
++-----------------+----------------------------+
+
+Thread-local storage
+~~~~~~~~~~~~~~~~~~~~
+
+TLS accesses are rewritten into accesses offset from ``x25``, which is a
+reserved register that points to a virtual register file, with a location for
+storing the sandbox's thread pointer. ``TP`` is the offset into that virtual
+register file where the thread pointer is stored.
+
++----------------------+-----------------------+
+|       Original       |       Rewritten       |
++----------------------+-----------------------+
+| .. code-block::      | .. code-block::       |
+|                      |                       |
+|    mrs xN, tpidr_el0 |    ldr xN, [x25, #TP] |
+|                      |                       |
++----------------------+-----------------------+
+| .. code-block::      | .. code-block::       |
+|                      |                       |
+|    mrs tpidr_el0, xN |    str xN, [x25, #TP] |
+|                      |                       |
++----------------------+-----------------------+
+
+Optimizations
+=============
+
+Basic guard elimination
+~~~~~~~~~~~~~~~~~~~~~~~
+
+If a register is guarded multiple times in the same basic block without any
+modifications to it during the intervening instructions, then subsequent guards
+can be removed.
+
++---------------------------+---------------------------+
+|         Original          |         Rewritten         |
++---------------------------+---------------------------+
+| .. code-block::           | .. code-block::           |
+|                           |                           |
+|    add x28, x27, wN, uxtw |    add x28, x27, wN, uxtw |
+|    ldur xN, [x28]         |    ldur xN, [x28]         |
+|    add x28, x27, wN, uxtw |    ldur xN, [x28, #8]     |
+|    ldur xN, [x28, #8]     |    ldur xN, [x28, #16]    |
+|    add x28, x27, wN, uxtw |                           |
+|    ldur xN, [x28, #16]    |                           |
+|                           |                           |
++---------------------------+---------------------------+
+

----------------
smithp35 wrote:

Many addresses are formed from an adrp/ldr pair:

```
adrp xN, target
ldr xN, [xN, imm]
```

If I've understood the rewrite table correctly, this becomes:

```
adrp xN, target
add x28, x27, wN, uxtw
ldr xN, [x28, imm]
```

As the adrp is pc-relative (±4GiB), the verifier could check that its target is within bounds. Would that be a possible optimisation? It's possible I've missed something.

https://github.com/llvm/llvm-project/pull/167061
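
To make the question concrete, here is a rough Python model of the guard arithmetic. It is only a sketch: it assumes the sandbox base held in x27 is 4GiB-aligned (the quoted document only states that the region is 4GiB in size), and the helper names `guard`, `adrp_target`, and `in_sandbox` are made up for illustration, not taken from LLVM or the LFI verifier.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1
SANDBOX_SIZE = 1 << 32  # 4GiB region, as described in the quoted document


def guard(base: int, addr: int) -> int:
    """Model of `add x28, x27, wN, uxtw`: base plus the zero-extended low 32 bits of addr."""
    return (base + (addr & MASK32)) & MASK64


def adrp_target(pc: int, imm_pages: int) -> int:
    """Model of `adrp`: the 4KiB page containing pc, plus a signed offset in pages."""
    return ((pc & ~0xFFF) + (imm_pages << 12)) & MASK64


def in_sandbox(base: int, addr: int) -> bool:
    """True if addr lies in [base, base + 4GiB)."""
    return base <= addr < base + SANDBOX_SIZE


# Assumption: the base is 4GiB-aligned, so its low 32 bits are zero.
base = 7 << 32
pc = base + 0x10234          # hypothetical location of the adrp inside the sandbox
target = adrp_target(pc, 5)  # hypothetical page offset of +5 pages

# If the adrp result is already inside the sandbox, the guard does not change it.
print(in_sandbox(base, target), guard(base, target) == target)
```

Under that alignment assumption the guard is the identity on any address already inside the sandbox, so a verifier that could bound the adrp result statically would not need the extra `add`. Whether the verifier knows enough about where the code is loaded to establish that bound is the open part of the question.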
