sivadeilra wrote:

> Oh so all this dance (`_ref_` and the additional metadata) is for code page 
> integrity purposes only? To keep them unmodified in memory? So how does then 
> the kernel use the PE metadata if it doesn't patch the code memory pages of 
> the initial (running) image? Is there an additional mechanism for trapping 
> and redirecting calls into the new image? If that's the case, there's no 
> really image patching involved at runtime? The "hot-patching" part is just a 
> _vue d'esprit_ to present the concept to the user?

Sorry, I may have unintentionally misled you.  Let me clarify.

Our workflow for generating patches is this:

1. A vulnerability is identified and the affected functions are identified.
2. Hot-patching requirements are checked (a combination of manual and automated 
checks).  The fix cannot introduce new DLL imports/exports, cannot change 
function signatures of existing functions (unless they are entirely inlined), 
cannot add new fields to existing types, etc.
3. An "intermediate patch image" is created. This is a normal compilation of a 
complete binary, but with the flags added to the compiler that tell it to 
hot-patch certain functions.  This step is the focus of this PR.  The compiler 
and linker generate a complete executable image, but the hot-patched functions 
use `__ref_*` indirection and the PDB contains a description of the hot-patched 
functions.
4. We run our hot-patch generation tool, which compares the original image 
(called the "base") with the "intermediate" image.  It automatically verifies 
many of our requirements; if those requirements fail, the developer must 
re-evaluate and go back to step 2.  The most common cause is that inlining 
caused a function to be pulled into the hot-patch set.  The output of this step 
is the "hot-patch metadata".
5. The tools modify a copy of the intermediate image and insert the "hot-patch 
metadata". This creates the "final image".  The final image can either be 
loaded as a complete, standalone binary (and will be, when the system reboots) 
or can be used to hot-patch an active instance of the base image.
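
To make the `__ref_*` indirection from step 3 more concrete, here is a minimal C sketch of the idea, not the actual compiler output. All names are hypothetical: `ref_counter` stands in for a compiler-generated `__ref_*` variable (the double-underscore spelling is avoided because it is reserved in C), `base_image_counter` stands in for a global that lives in the base image, and `bind_refs_to_base` stands in for whatever applies the hot-patch metadata.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a global variable owned by the base image. */
static int base_image_counter = 41;

/* Hypothetical stand-in for a compiler-generated __ref_* variable in the
 * patch image. Hot-patched code reaches the global through this pointer
 * instead of referencing its own copy of the global. */
static int *ref_counter = NULL;

/* A hot-patched function: every access to the global goes through the
 * indirection, so it operates on the base image's state. */
int patched_increment(void) {
    assert(ref_counter != NULL); /* must be bound before the code runs */
    *ref_counter += 1;
    return *ref_counter;
}

/* Assumption: models the step where the metadata is used to point the
 * __ref_* variables at the variables in the base image. */
void bind_refs_to_base(void) {
    ref_counter = &base_image_counter;
}

/* Small driver: bind, call the patched function, and report the value
 * that the base image's variable now holds. */
int demo_ref_indirection(void) {
    bind_refs_to_base();
    patched_increment();
    return base_image_counter;
}
```

The point of the sketch is only that the patched function mutates the base image's variable rather than a private copy; how the real pointers are named, emitted, and initialized is determined by the compiler and the kernel, not by this example.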

The metadata describes code patches in _both_ directions, as well as data patches 
in _one_ direction.  The code patches modify the base image in memory, using a 
code update idiom that does not require stopping threads or CPUs.  The code 
patches also modify the hot-patch image, so that function calls to non-patched 
code are modified to point back into the base image.  This is done so that 
multiple hot-patches of the same binary can occur; each new hot-patch 
completely replaces the code from the previous hot-patch, although existing 
threads (or CPUs) may continue executing those paths until they return from 
them.  The hot-patch metadata also describes how to set the initial value of 
the `__ref_*` variables to point into the variables in the base image.
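
The two directions of code patching can be illustrated with a small C model. This is a sketch under strong assumptions: the real mechanism rewrites machine code in place, whereas this model uses atomic function-pointer "slots" to stand in for patched call sites. Every name here (`slot_vulnerable`, `patch_slot_helper`, `apply_hot_patch`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*fn_t)(int);

/* ---- Base image (hypothetical) ---- */
static int base_helper(int x) { return x + 1; }      /* not patched */
static int base_vulnerable(int x) {                  /* to be replaced */
    return base_helper(x) * 2;
}

/* Models the entry point in the base image that gets redirected. */
static _Atomic(fn_t) slot_vulnerable = base_vulnerable;

/* ---- Patch image (hypothetical) ---- */

/* Models a call site in the patch image that targets non-patched code;
 * it is rewritten to point back into the base image. */
static _Atomic(fn_t) patch_slot_helper = NULL;

static int patched_vulnerable(int x) {
    fn_t helper = atomic_load(&patch_slot_helper);
    if (x < 0) {
        return 0;                /* the hypothetical fix */
    }
    return helper(x) * 2;        /* runs the base image's helper */
}

/* Applies both directions. The patch-to-base binding is done first so
 * that no caller can enter the patched function before its outgoing
 * call has a valid target. */
void apply_hot_patch(void) {
    fn_t back_to_base = base_helper;
    fn_t into_patch = patched_vulnerable;
    atomic_store(&patch_slot_helper, back_to_base); /* patch -> base */
    atomic_store(&slot_vulnerable, into_patch);     /* base -> patch */
}

/* How callers reach the function, before or after patching. */
int call_vulnerable(int x) {
    fn_t target = atomic_load(&slot_vulnerable);
    return target(x);
}
```

In this model, a thread that loaded the old target just before `apply_hot_patch` simply finishes running the old code, which mirrors the note above that existing threads may continue executing the previous paths until they return from them.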

So I didn't mean to imply that the sole purpose of the `__ref_*` pointers was 
to avoid modifying the original image, in memory.

The Windows kernel contains the code that interprets the hot-patch metadata, so 
the format and semantics of the hot-patch metadata are determined by that code. 
This PR is meant to enable Clang (and eventually Rust) to generate code that 
can work in this workflow.

For reference, @dpaoliello and I are the authors of Rust code that executes in 
the Windows kernel.  Our motivation for this work is to enable hot-patching of 
this code and of related non-Rust code in the same images.  Aligning LLVM's 
codegen with this workflow and providing the `S_HOTPATCHFUNC` symbol are both 
necessary for enabling this scenario and for continuing Rust development 
within the Windows kernel.

https://github.com/llvm/llvm-project/pull/138972
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
