Ulrich Weigand wrote:
> As the guest could be using any of a large number of instructions
> to access the page tables (and other virtualized structures) --
> basically any instruction that accesses memory -- we'll probably
> need a quite complete x86 instruction emulator to do so ...
>
> (We'll probably need that anyway, I just wanted to point it out.)
That's one way to do it. If you don't want to emulate any of
the user instructions which access memory, though, I suggest
single-stepping the faulting instruction, as a debugger would,
while temporarily either changing the page mapping to point at
the real copy of the data, or modifying the bytes accessed.
This way, you only emulate system instructions which you can't
execute natively.
For example, we have an add instruction:
add [some_address], #0x12345678
Let's say we have separate pages, one for the guest data, one for
the private data. Initially the page mapping points to our private
copy. The add instruction hits a page fault and we do
this...
* save current page mapping of some_address (private copy)
* change page mapping of some_address to guest data with user access
* INVLPG (some_address)
* step 1 instruction (add)
* restore page mapping to private page
* INVLPG (some_address)
* act on change of data accordingly
We can grab the linear address of the page in question out of CR2
in the page fault handler.
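The steps above can be sketched in C as a user-space model. This
is only an illustration, not monitor code: vpage_t, sim_invlpg(),
guest_add() and handle_fault() are hypothetical names, the
"mapping" pointer stands in for the real page table entry, and
calling a function pointer stands in for single-stepping the
faulting instruction.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_SIZE = 4096, WORDS_PER_PAGE = PAGE_SIZE / 4 };

/* Hypothetical model of one virtual page backed by two frames. */
typedef struct {
    uint32_t private_frame[WORDS_PER_PAGE]; /* monitor's private copy   */
    uint32_t guest_frame[WORDS_PER_PAGE];   /* data the guest should see */
    uint32_t *mapping;                      /* stands in for the PTE     */
    unsigned invlpg_count;                  /* simulated INVLPG calls    */
} vpage_t;

/* The single "stepped" instruction, modelled as a callback. */
typedef void (*insn_fn)(uint32_t *page, unsigned index);

/* Stand-in for INVLPG: just counts how often we flushed. */
void sim_invlpg(vpage_t *p) { p->invlpg_count++; }

/* Models: add [some_address], 0x12345678 */
void guest_add(uint32_t *page, unsigned index)
{
    page[index] += 0x12345678u;
}

/*
 * cr2 is the faulting linear address, as read from CR2.
 * Returns 1 if the stepped instruction changed the guest data,
 * so the caller can act on the change; 0 otherwise.
 */
int handle_fault(vpage_t *p, uint32_t cr2, insn_fn step_one)
{
    unsigned index = (cr2 & (PAGE_SIZE - 1)) / 4; /* word within page */
    uint32_t *saved = p->mapping;       /* save mapping (private copy) */
    uint32_t before = p->guest_frame[index];

    assert(index < WORDS_PER_PAGE);

    p->mapping = p->guest_frame;        /* map guest data              */
    sim_invlpg(p);
    step_one(p->mapping, index);        /* step 1 instruction          */
    p->mapping = saved;                 /* restore private mapping     */
    sim_invlpg(p);

    return p->guest_frame[index] != before;
}
```

With the mapping initially set to private_frame, a call such as
handle_fault(&page, cr2, guest_add) leaves the private copy
untouched, and the return value tells the monitor whether it
has a change to act on.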
If the data item at some_address and some other virtualized
data structures share the same page, we can use
a related technique to feed the add instruction the data
it should see:
* save data at some_address
* change data at some_address if needed to what guest should see
* step 1 instruction (add)
* act on data change
* restore data if need be
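A minimal sketch of this second variant, again as a user-space
model. The names step_with_patched_data() and add_imm32() are
made up for illustration, and the function pointer call stands
in for single-stepping the real instruction.

```c
#include <assert.h>
#include <stdint.h>

/* The single "stepped" instruction, modelled as a callback. */
typedef void (*insn_fn)(uint32_t *operand);

/* Models: add [some_address], 0x12345678 */
void add_imm32(uint32_t *operand) { *operand += 0x12345678u; }

/*
 * slot:       memory at some_address, holding the monitor's value
 * guest_view: in/out, the value the guest believes is stored there
 * Returns 1 if the guest's view of the data changed, else 0.
 */
int step_with_patched_data(uint32_t *slot, uint32_t *guest_view,
                           insn_fn step_one)
{
    uint32_t saved  = *slot;        /* save data at some_address      */
    uint32_t before = *guest_view;

    *slot = before;                 /* patch in what guest should see */
    step_one(slot);                 /* step 1 instruction             */
    *guest_view = *slot;            /* act on the data change         */
    *slot = saved;                  /* restore the monitor's data     */

    return *guest_view != before;
}
```

Here "acting on the change" is reduced to recording the new guest
value. A real monitor would presumably also update whatever
virtualized state that value feeds into.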
Anyway, this is a good lazy way to avoid emulating
any normal user instructions. We could add emulation
later to speed things up if it makes sense.
What this technique doesn't allow us to do is to
protect against guest reads to our private virtualized
structures.
> Eh, but how do you propose to prevent the guest OS from
> corrupting monitor code/data pages? While guest code is
> running, at least the IDT and the interrupt handlers
> must be mapped into the virtual address space. This means
> that these pages can be modified by any code running in
> supervisor mode, which would include the guest OS ...
>
> Thus, the guest OS could write itself an interrupt gate
> to ring 0 into the IDT, and completely take control of
> the computer ...
>
> [ Well, I guess this might be prevented by thoroughly
> pre-scanning the guest OS code, but then again, it
> seems difficult (how to decide whether an arbitrary
> write to a computed address might hit the real IDT
> or not?) ]
Mark the pages in question read-only. Guest code running at
ring 1 will then take an exception if it writes to them, but
the interrupt code will still run. Even the page tables would
have to be read-only while running the guest. This all
assumes we run with the CR0.WP flag set.
In order to modify things in the IDT code, re-mark the pages
as read/write. To do this, we have to diddle the CR0.WP flag
to override the protection, or do some funkiness reloading
the PDBR. Ring 1 code can't do either of these things, since
writing CR0 or the PDBR is privileged. Before returning
from the IDT code back to guest code, set things back
to read-only.
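To make the role of CR0.WP concrete, here is a small model of
the write check as I understand it for classic 32-bit paging.
pte_flags_t and write_allowed() are hypothetical names, and the
flags are assumed to be the effective combination of the page
directory and page table entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Effective protection bits for one page (PDE and PTE combined). */
typedef struct {
    bool present;
    bool writable;  /* R/W bit: false means read-only  */
    bool user;      /* U/S bit: false means supervisor */
} pte_flags_t;

/* Returns true if a write at privilege level cpl (0..3) succeeds. */
bool write_allowed(int cpl, pte_flags_t pte, bool cr0_wp)
{
    assert(cpl >= 0 && cpl <= 3);

    if (!pte.present)
        return false;                  /* not mapped: always faults */

    if (cpl == 3)                      /* user mode                 */
        return pte.user && pte.writable;

    /* Rings 0-2 are supervisor: read-only is enforced only if WP. */
    return pte.writable || !cr0_wp;
}
```

With WP set, write_allowed(1, read_only_page, true) is false, so
the ring 1 guest faults. Ring 0 faults on the same page too, which
is why the monitor has to clear WP (or switch mappings) before it
writes.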
> > A more complex approach would be to save virtualized
> > page table information across PDBR reloads. The idea
> > here is that when the guest OS schedules a task whose
> > page tables are already virtualized and stored, we can
> > save a number of page faults and execution of associated
> > monitor code, which would otherwise be incurred from
> > the dynamic rebuilding of the page tables.
>
> Of course, you need to take into account that the page
> tables might be modified while they are *not* currently
> active ... Thus, you can't simply re-use the old monitor
> page tables if the guest reloads a PDBR value that we've
> already seen in the past.
>
> So, you either have to examine the page tables in detail
> anyway, to check for potential modification, or else you
> track that memory for modifications even while it is
> *not* actively used as page tables. (How do you detect
> that the page table has been destroyed for good, and the
> memory is reused for something completely different?)
Very true. We'd need to keep tabs on any memory regions
which we had cached page tables for, whether active or
not.
There is no way to know when a page table goes out
of commission. We'd probably allocate memory for a fixed
number of virtualized page tables. When we want to create
a new one and none is free, just dump the one that is least used.
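A rough sketch of such a fixed-size cache, keyed by the guest's
PDBR value and evicting the least-used entry. All names here
(pt_cache_t, pt_cache_lookup() and so on) and the slot count are
invented for illustration, and a plain use counter is only one
possible reading of "least used".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PT_CACHE_SLOTS = 4 };   /* assumed size, pick to taste */

typedef struct {
    uint32_t guest_pdbr;       /* guest PDBR these tables belong to */
    uint32_t use_count;        /* bumped on every hit               */
    int      valid;
} pt_slot_t;

typedef struct {
    pt_slot_t slots[PT_CACHE_SLOTS];
} pt_cache_t;

/* Returns the cached slot for pdbr, or NULL on a miss. */
pt_slot_t *pt_cache_lookup(pt_cache_t *c, uint32_t pdbr)
{
    for (size_t i = 0; i < PT_CACHE_SLOTS; i++) {
        pt_slot_t *s = &c->slots[i];
        if (s->valid && s->guest_pdbr == pdbr) {
            s->use_count++;
            return s;
        }
    }
    return NULL;
}

/* Takes a free slot if there is one, else dumps the least-used. */
pt_slot_t *pt_cache_insert(pt_cache_t *c, uint32_t pdbr)
{
    pt_slot_t *victim = &c->slots[0];
    for (size_t i = 0; i < PT_CACHE_SLOTS; i++) {
        pt_slot_t *s = &c->slots[i];
        if (!s->valid) {
            victim = s;                 /* free slot, stop looking */
            break;
        }
        if (s->use_count < victim->use_count)
            victim = s;
    }
    victim->guest_pdbr = pdbr;
    victim->use_count  = 1;
    victim->valid      = 1;
    return victim;
}

/* Call when the guest writes to memory we had cached tables for. */
void pt_cache_invalidate(pt_cache_t *c, uint32_t pdbr)
{
    for (size_t i = 0; i < PT_CACHE_SLOTS; i++)
        if (c->slots[i].valid && c->slots[i].guest_pdbr == pdbr)
            c->slots[i].valid = 0;
}
```

On a PDBR reload the monitor would try pt_cache_lookup() first
and fall back to pt_cache_insert() plus a rebuild on a miss.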
-Kevin