Joerg Roedel wrote:
> On Tue, Jan 29, 2008 at 07:20:12PM +0200, Avi Kivity wrote:
>
>> Here's a rough sketch of my proposal:
>>
>> - For every memory slot, allocate an array containing one int for every
>>   potential large page included within that memory slot.  Each entry in
>>   the array contains the number of write-protected 4KB pages within the
>>   large page frame corresponding to that entry.
>>
>>   For example, if we have a memory slot for gpas 1MB-1GB, we'd have an
>>   array of size 511, corresponding to the 511 2MB pages from 2MB upwards.
>>   If we shadow a pagetable at address 4MB+8KB, we'd increment the entry
>>   corresponding to the large page at 4MB.  When we unshadow that page,
>>   decrement the entry.
>>
>
> You need to take care that the 2MB gpa is aligned to 2MB host physical to
> be able to map it correctly with a large pte. So maybe we need two
> memslots for 1MB-1GB. One for 1MB-2MB using normal 4KB pages and one
> from 2MB-1GB which can be allocated using HugeTLBfs.
>
Another option is to allocate all memory starting from address zero
using hugetlbfs, and pass 0-640K as one memslot and 1MB+ as another.

In any case the kernel needs to support both methods (e.g. it must handle
a memslot that starts in the middle of a large page).

>> - If we attempt to shadow a large page (either a guest pse pte, or a
>>   real-mode pseudo pte), we check if the host page is a large page.  If
>>   so, we also check the write-protect count array.  If the result is zero,
>>   we create a shadow pse pte.
>>
>> - Whenever we write-protect a page, also zap any large-page mappings for
>>   that page.  This means rmap will need some extension to handle pde rmaps
>>   in addition to pte rmaps.
>>
>
> This sounds straightforward to me. All you need is a short value for
> every potential large page and initialize it with -1 if the host page is
> a large page and with 0 otherwise. Every time this value reaches -1 we
> can map the page with a large pte (and the guest maps with large pte).
>

You don't know whether the host page is a large page in advance.  It
needs to be checked during pagefault time.

>> - qemu is extended to have a command-line option to use large pages to
>>   back guest memory.
>>
>> Large pages should improve performance significantly, both with
>> traditional shadow and npt/ept.
>>
>
> Yes, I think that too. But with shadow paging it really depends on the
> guest if the performance increase is long-term. In a Linux guest,
> for example, the direct mapped memory will become fragmented over
> time (together with the location of the page tables). So the
> number of potential large page mappings will likely decrease over
> time.
>

Yes, that's why it is important to be able to fail fast when checking
whether we can use a large spte.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
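
To make the bookkeeping in the proposal concrete, here is a small Python
model of the per-memslot write-protect count array and the fail-fast check
for a large spte. It is only a sketch of the arithmetic, not kernel code:
the names MemSlot, shadow, unshadow and can_use_large_spte are made up for
illustration, the large page size is assumed to be 2MB, and whether the
host page is large is passed in by the caller, since that is only known at
pagefault time.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
LARGE_PAGE_SIZE = 2 * MB  # assumption: 2MB large pages


class MemSlot:
    """Hypothetical model of a memory slot covering gpas [base, base + size)."""

    def __init__(self, base, size):
        end = base + size
        # First 2MB boundary at or above the slot start (round up).
        self.first_large = -(-base // LARGE_PAGE_SIZE) * LARGE_PAGE_SIZE
        # Last 2MB boundary at or below the slot end (round down).
        last_large = end // LARGE_PAGE_SIZE * LARGE_PAGE_SIZE
        count = max(0, (last_large - self.first_large) // LARGE_PAGE_SIZE)
        # One counter per large page frame fully contained in the slot.
        self.write_protected = [0] * count

    def _index(self, gpa):
        """Return the counter index for gpa, or None if no full large page covers it."""
        if gpa < self.first_large:
            return None
        index = (gpa - self.first_large) // LARGE_PAGE_SIZE
        return index if index < len(self.write_protected) else None

    def shadow(self, gpa):
        """A 4KB page at gpa became write-protected (e.g. a shadowed pagetable)."""
        index = self._index(gpa)
        if index is not None:
            self.write_protected[index] += 1

    def unshadow(self, gpa):
        """The 4KB page at gpa is no longer write-protected."""
        index = self._index(gpa)
        if index is not None:
            if self.write_protected[index] == 0:
                raise ValueError("unshadow without matching shadow")
            self.write_protected[index] -= 1

    def can_use_large_spte(self, gpa, host_page_is_large):
        """Fail fast: cheap checks first, then the counter lookup."""
        if not host_page_is_large:
            return False
        index = self._index(gpa)
        if index is None:
            return False
        return self.write_protected[index] == 0
```

With a slot for gpas 1MB-1GB, MemSlot(1 * MB, 1 * GB - 1 * MB) gets 511
counters, and shadow(4 * MB + 8 * KB) increments the counter for the large
page at 4MB, matching the example in the proposal. A gpa in the 1MB-2MB
range has no counter at all, which is the "memslot that starts in the
middle of a large page" case: it can only be mapped with 4KB pages.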