Kevin Lawton wrote:

> I'm currently adding framework to plex86 to handle guest OS paging.
> After that, it should be fairly straight-forward to
> fill things out to boot Linux, given we load it straight
> into memory and bypass the BIOS.

I've been looking at your latest (tarfile) version, and IMO we should
try to get this into CVS as soon as possible.  Having a CVS tree that
lags behind current development (in significant ways) is not very
useful ;-)

I've cleaned up your version a bit (removing now obsolete cruft like
the RET_BECAUSE_ handling), but I'd prefer to have the new version 
support at least the same major functionality as the current CVS 
before replacing it.  The only thing missing in that respect is the 
hardware interrupt handling, AFAICS (and the debugger features to
intercept/reflect interrupts).

Did you implement interrupt handling in the meantime?  Otherwise, I'd
add this myself and put the new version into CVS ...


B.t.w. I noticed one point that seems a little odd:  when performing
SBE, you keep the original PTE stored in a global variable and restore
it on return to the monitor.  IMO this isn't necessary: as soon as
you have placed a TLB entry pointing to the virtualized page into the
I TLB, you can immediately put the original PTE back, only with the U/S
bit cleared.  That way, guest accesses to the page would still trap,
while the monitor is able to access the (original) page without needing
to restore anything.  The advantage is that you don't need a global
variable to store anything, which simplifies the use of more than one
virtualized page at the same time.
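
In C-like pseudo form, the whole dance boils down to something like
this (just a sketch using the names from the attached sbe();
load_itlb_entry() stands for the invlpg-plus-dummy-call sequence done
in inline assembly below, it is not a real function):

    pageEntry_t orig = *codepage_pte_p; /* keep the real PTE in a local  */
    *codepage_pte_p  = pte_new;         /* U/S=1, maps virtualized page  */
    load_itlb_entry(codepage_laddr);    /* I TLB caches the user entry   */
    orig.US = 0;                        /* supervisor only from now on   */
    *codepage_pte_p  = orig;            /* guest data access traps, the  */
                                        /* monitor reads the real page;  */
                                        /* nothing left to restore later */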

I've attached a correspondingly modified version of sbe(), which appears
to be working fine.  Let me know if I've overlooked anything ...

(Another minor point: when using %dl explicitly in a gcc inline assembly
statement, it is not enough to mark edx as clobbered, you need the
*early clobber* modifier ("=&d" in the code below), otherwise gcc feels
free to use edx for an input operand as well :-/   This I found out the
hard way ;-) )
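
To illustrate (a made-up minimal example, not from the plex86 sources;
'some_laddr' is just an arbitrary writable address): without the '&',
gcc assumes the output is only written after all inputs have been
consumed, so it may pick edx for the input, and the movb then destroys
it before the xchgb reads it:

    Bit32u scratch;
    asm volatile
    (
        "movb $0xc3, %%dl\n\t"  /* writes %dl before reading input %1 */
        "xchgb %%dl, (%1)\n\t"
        : "=&d" (scratch)       /* '&' reserves edx from the start    */
        : "r" (some_laddr)
        : "memory"
    );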


void
sbe(guest_context_t context)
{
    nexus_t *nexus = (nexus_t *) (((Bit32u) &context) & 0xfffff000);
    vm_t    *vm    = (vm_t *) nexus->vm;

    Bit32u guest_laddr, guest_paddr, temp;
    unsigned seg32;
    unsigned icache_index;
    unsigned pdi, pti;
    pageEntry_t pte_new, *codepage_pte_p;
    Bit32u laddr, codepage_laddr, private_codepage_laddr;

    if ( (vm->guest_cpu.cr0.u.raw & 0x80000001) != 0x1 ) 
        monpanic(vm, "sbe: not pg=0,pe=1\n");

    /* Return early if SBE not currently active */
    if (!vm->sbe) return;
    
    cache_sreg(vm, SRegCS);
    guest_laddr = vm->guest_cpu.desc_cache[SRegCS].base + context.eip;
    guest_paddr = guest_laddr;  /* No paging supported yet, identity map */

    /* Make sure guest physical address is within bounds. */
    if (guest_paddr >= vm->pages.guest_n_bytes)
      monpanic(vm, "sbe: phy addr OOB\n");

    /* Make sure physical page is mapped into memory before accessing. */
// +++ optimize >> 12 stuff
    map_guest_phy_page(vm, guest_paddr);
    validate_page_timestamp(vm, guest_paddr);
    if ( !(vm->page_usage[guest_paddr>>12].attr & GuestPageVCode) )
      vm->page_usage[guest_paddr>>12].attr |= GuestPageVCode;

    /* Perform prescanning */
    seg32 = vm->guest_cpu.desc_cache[SRegCS].desc.d_b;
    icache_index = prescan(vm, guest_paddr, seg32, G_CPL(vm), &opcode_vmap, 1);

    /* Update timestamp for latest time of guest execution */
    vm->code_cache[icache_index].ts.guest_executed = vm_rdtsc();

    /* 
     * Split I&D TLB trick.  Our scan-before-execute technique
     * creates a private modified (virtualized) copy of the current
     * code page.  We want to be able to execute from this private
     * code page, yet allow memory accesses to see the original
     * code/data page so code does not detect our changes.  Since
     * the IA32 processor does not offer a native mechanism to
     * differentiate between read and code accesses within a page,
     * we make use of the split I&D TLB nature.  What we need to
     * do is to get the TLB to load a value with user privilege
     * into the I TLB cache, then reset the corresponding page
     * table entry to contain supervisor privilege.  Thus code
     * will continue to execute based on the user privilege level,
     * yet data accesses will fetch the entry from the actual
     * page tables and cause an exception.
     */

    /* The linear address of the real code page */
    laddr = guest_laddr & 0xfffff000;
    codepage_laddr = Guest2Monitor(vm, laddr);

    /* The linear address of the virtualized code page */
    private_codepage_laddr =
      ((Bit32u)vm->guest.addr.icache) + (icache_index<<12);

    /* The address of the PTE pointing to the real code page */
    pdi = laddr >> 22;
    pti = (laddr >> 12) & 0x3ff;
    codepage_pte_p = (pageEntry_t *)
        (((Bit32u)vm->guest.addr.page_tbl) + (pdi<<12) + (pti<<2));

    /* The temporary PTE pointing to the virtualized page */
    pte_new.base = vm->pages.icache[icache_index];
    pte_new.avail = 0;
    pte_new.G = 0;      /* not global           */
    pte_new.PS = 0;     /* (unused in pte)      */
    pte_new.D = 0;      /* clean                */
    pte_new.A = 0;      /* not accessed         */
    pte_new.PCD = 0;    /* normal caching       */
    pte_new.PWT = 0;    /* normal write-back    */
    pte_new.US = 1;     /* user                 */
    pte_new.RW = 0;     /* read-only            */
    pte_new.P = 1;      /* present in memory    */


    asm volatile
    (
        /* Write the 'virtual' page table entry, saving the real one
           in the register holding pte_new. */
        "xchgl %1, (%4)\n\t"

        /* Save the byte at the beginning of the virtualized code page
           and replace it with the opcode for RET, using the alternate
           address. */
        "movb $0xc3, %%dl\n\t"
        "xchgb %%dl, (%2)\n\t"

        /* Invalidate the page entry for the real code page address (I&D). */
        "invlpg (%3)\n\t"

        /* Call the RET instruction (it just comes right back) using the
           actual code page address; this loads the TLB (I-only). */
        "call *%3\n\t"

        /* Replace the RET instruction with the original byte, using
           the alternate address. */
        "xchgb %%dl, (%2)\n\t"

        /* Restore the 'real' page table entry with the U/S bit (bit 2)
           cleared, activating protection for guest accesses. */
        "andl $0xfffffffb, %1\n\t"
        "movl %1, (%4)\n\t"

        /* pte_new is both read and overwritten by the asm, so it must
           be an in/out operand ("+r"), not a plain input. */
        : "=&d" (temp), "+r" (pte_new)
        : "r" (private_codepage_laddr), "r" (codepage_laddr),
          "r" (codepage_pte_p)
        : "memory", "cc"
    );
}


Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  [EMAIL PROTECTED]