Jens,
Sounds like we're converging on implementation ideas.
I just put a lot of travel and tax forms behind me. Man,
I've been itching to get back to coding. Back to the
good stuff...
I think that you're right that we should start by dynamically
mapping in pages. We can easily enhance this by mapping in larger
chunks each time.
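
Roughly what I have in mind, as a sketch only. All names here
(shadow, translate_guest_page, map_on_fault, CHUNK_PAGES) are made up
for illustration and are not existing code; the "translation" is a
stand-in. With CHUNK_PAGES set to 1 this is plain demand mapping, and
raising it maps the whole aligned chunk around the faulting page:

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SHIFT   12
#define NUM_PAGES    1024        /* hypothetical: one page table's worth of pages */
#define CHUNK_PAGES  4           /* hypothetical tuning knob; 1 = pure demand mapping */
#define NOT_MAPPED   UINT32_MAX

/* shadow[i] holds the host frame backing guest page i, or NOT_MAPPED. */
static uint32_t shadow[NUM_PAGES];

/* Stand-in for the real guest-to-host translation (assumption, not real code). */
static uint32_t translate_guest_page(uint32_t guest_page)
{
    return guest_page + 0x100;
}

/* Mark every page as unmapped, e.g. after the guest reloads CR3. */
static void shadow_reset(void)
{
    for (size_t i = 0; i < NUM_PAGES; i++)
        shadow[i] = NOT_MAPPED;
}

/* Called from the page fault path: map the aligned chunk that contains
   the faulting page. Returns the number of pages newly mapped. */
static unsigned map_on_fault(uint32_t linear_addr)
{
    uint32_t page   = linear_addr >> PAGE_SHIFT;
    uint32_t first  = page - (page % CHUNK_PAGES);
    unsigned mapped = 0;

    for (uint32_t p = first; p < first + CHUNK_PAGES && p < NUM_PAGES; p++) {
        if (shadow[p] == NOT_MAPPED) {
            shadow[p] = translate_guest_page(p);
            mapped++;
        }
    }
    return mapped;
}
```

A fault at 0x5000 (page 5) would map pages 4 through 7 in one go, and a
later fault anywhere in that chunk finds nothing left to do.
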
-Kevin
Jens Nerche wrote:
>
> Hi,
> while coding the virtual page tables, some ideas came up...
>
> * I threw away my first implementation ;)
> The approach was to translate virtual pages and insert them into
> the monitor page tables actually in use as soon as they were placed into
> the virtual page tables. This produces fewer page faults while running
> the guest, but the penalty is that all pages are processed, whether they
> are used or not, so we need more translation time and more memory. It gets
> worse with SBE and a larger number of page tables in use. So I followed
> the approach described in Kevin's virtualization paper: defer translation
> and so on until the guest demands it (by causing a page fault).
> Setting a page directory means unmapping the dir and all page tables in use
> from the guest's address space and writing the virtual CR3. Writes to and
> reads from a page dir/table cause page faults, caught by our page fault
> handler. The right "pager" (which resolves the fault) is selected by
> looking at the linear page fault address and the virtual CPL. None of this
> is new, I just wanted to summarize. I want to add some details I missed:
> - Write access to a page dir/table: map the page dir/table into the guest's
> address space, retry the faulting instruction in single-step mode, unmap
> the page dir/table, done
> - Read access: like write, but we have to update the A and D bits first. We
> can't tell whether the guest is reading or writing, so assume a read every
> time
> - Single-stepping is done by setting the T-bit in EFLAGS. The T-bit is cleared
> automatically. An alternative would be to use the debug registers, but
> I think that's slower.
> - Using the U/S bit to unmap pages from the guest is better than the P bit:
> with U/S the pages stay accessible to the monitor, with P they do not.
> - Software TLBs can speed up the monitor (we have to flush the guest's
> address space on PDBR changes)
> - Of course, all emulation is done in guest context ;) (my first implementation
> did some things in host context...)
>
> * The monitor may need some additional pages for emulation, for example when
> a new page table is needed. This suggests three things:
> - An extended communication protocol between host and guest
> - Mechanisms to suspend emulation in the monitor, do some work on the host
> side, and resume emulation in the monitor (nearly done, only a few changes needed)
> - A "page pool" or "page cache" with a minimum and maximum number of free pages.
> The monitor takes pages from this pool for use or gives back unused pages; if
> the pool drops below the minimum, the host has to refill it; if it rises
> above the maximum, the host can take the spare pages for itself (or other
> virtual machines... ;)
>
> I'll go ahead with the page tables, write some docs about the host<->guest
> communication protocol, extend this protocol, implement suspending and resuming
> emulation in the monitor (I'm thinking about callback functions) and implement
> the page pool. I hope this doesn't conflict with any changes from you. (Kevin?)
>
> jens
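
For the page pool described above, a minimal sketch in C might look
like the following. Everything in it is an assumption for illustration:
the names (page_pool, pool_take, pool_give, pool_rebalance), the
watermark values, and the counter standing in for the host's page
allocator. It only shows the min/max watermark logic, not the
host<->guest protocol around it:

```c
#include <stddef.h>

#define POOL_MIN   4     /* hypothetical low watermark  */
#define POOL_MAX   16    /* hypothetical high watermark */
#define POOL_CAP   32    /* hypothetical hard capacity  */

typedef enum { POOL_OK, POOL_NEEDS_REFILL, POOL_HAS_SPARE } pool_state;

typedef struct {
    unsigned long frames[POOL_CAP];  /* free page frames, used as a stack */
    size_t        count;             /* number of free frames in the pool */
} page_pool;

/* Compare the fill level against the two watermarks. */
static pool_state pool_check(const page_pool *p)
{
    if (p->count < POOL_MIN) return POOL_NEEDS_REFILL;
    if (p->count > POOL_MAX) return POOL_HAS_SPARE;
    return POOL_OK;
}

/* Monitor side: take one free page. Returns 0 on success, -1 if empty. */
static int pool_take(page_pool *p, unsigned long *frame)
{
    if (p->count == 0) return -1;
    *frame = p->frames[--p->count];
    return 0;
}

/* Monitor side: give back an unused page. Returns -1 if the pool is full. */
static int pool_give(page_pool *p, unsigned long frame)
{
    if (p->count == POOL_CAP) return -1;
    p->frames[p->count++] = frame;
    return 0;
}

/* Host side: bring the pool back between the watermarks. *next_free is a
   stand-in for the host allocator (assumption). Returns the net number of
   pages added; the value is negative when spare pages were reclaimed. */
static long pool_rebalance(page_pool *p, unsigned long *next_free)
{
    long delta = 0;
    while (p->count < POOL_MIN) { p->frames[p->count++] = (*next_free)++; delta++; }
    while (p->count > POOL_MAX) { p->count--; delta--; }
    return delta;
}
```

The monitor would call pool_check() after each take or give and, when
the result is not POOL_OK, suspend emulation so the host can run
something like pool_rebalance().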