Re: [plex86] Performance enhancement: eliminating mode and context switches

Ramon van Handel Mon, 18 Dec 2000 07:30:34 -0800
>We'd have to work out some issues with this method, including
>placing handling code somewhere in the guest CS segment range,
>generating PIC code for the receiving functions, etc.

Well, we could do it with far calls.  Slightly slower, but with the fewest problems:

(1) put the nexus in a special code segment
(2) use far calls to the nexus code segment as virtualisation instructions
    (instead of int3)
(3) emulate segment register loads / segment overrides in such a way that
    the nexus segment is never stepped on.
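
A rough sketch of steps (2) and (3), written in Python purely for illustration. The `CALL ptr16:32` encoding (opcode 0x9A, then a 32-bit offset, then a 16-bit selector, seven bytes in 32-bit code) is standard x86. The selector value `NEXUS_SELECTOR` and the helper names are assumptions made up for this example, not plex86 code.

```python
import struct

FAR_CALL_OPCODE = 0x9A    # CALL ptr16:32 in 32-bit code
NEXUS_SELECTOR = 0x0063   # hypothetical: GDT index 12, TI=0, RPL=3


def encode_far_call(selector: int, offset: int) -> bytes:
    """Build the 7-byte far call: opcode, 32-bit offset, 16-bit selector."""
    return struct.pack("<BIH", FAR_CALL_OPCODE, offset, selector)


def guest_may_load(selector: int) -> bool:
    """Step (3): refuse guest selectors that name the nexus descriptor.

    The two low RPL bits are ignored so that the same descriptor cannot be
    reached with a different requested privilege level.
    """
    return (selector & ~0x3) != (NEXUS_SELECTOR & ~0x3)


patch = encode_far_call(NEXUS_SELECTOR, 0x1000)
print(patch.hex())             # bytes planted in place of the int3
print(guest_may_load(0x0023))  # an ordinary guest selector
print(guest_may_load(0x0060))  # same descriptor as the nexus, RPL 0
```

The guard is what an emulated segment register load would consult before letting a guest selector through.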

This makes virtualisation fast, and also solves the problem that we would
otherwise have to dump the prescan cache if the nexus is remapped.
Generating PIC code schemes is something I enjoy doing, so I think I'll
work on that :)
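
Presumably the remapping problem goes away because a far call names its target as selector:offset, so the patched bytes never contain the linear address of the nexus. A minimal sketch of that reasoning, with a made-up `Descriptor` type and made-up addresses:

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int    # linear base address of the segment
    limit: int   # highest valid offset


def linear_address(desc: Descriptor, offset: int) -> int:
    """Translate a segment-relative offset to a 32-bit linear address."""
    if offset > desc.limit:
        raise ValueError("offset beyond segment limit")
    return (desc.base + offset) & 0xFFFFFFFF


nexus = Descriptor(base=0xFFC00000, limit=0xFFFF)  # hypothetical placement
handler_offset = 0x0120                            # offset stored in the patch

before = linear_address(nexus, handler_offset)
nexus.base = 0xE0000000                            # the nexus is remapped
after = linear_address(nexus, handler_offset)

# The offset in the patched guest code is unchanged; only the descriptor moved.
print(hex(before), hex(after))
```

Only the descriptor base has to be updated on a remap; the cached patches stay valid.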

Another advantage of this kind of scheme is that it is easy to replace
emulation of an instruction with dynamic translation (just replace the PIC
code with the dynamically translated code).  So at a certain point, we can
run a profiler on the system and simply dynamically translate the most
performance-intensive instructions (this is really easy using something like
VCODE.  I love VCODE :))
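
To make the swap concrete, here is a small sketch under the assumption that each patched guest address has its own handler and that a hit counter stands in for the profiler. `NexusHandlers`, `emulate_cli` and `translated_cli` are hypothetical names, and the "translated" routine is an ordinary function standing in for VCODE output.

```python
from collections import Counter


class NexusHandlers:
    """Maps a patched guest address to the routine its call lands in."""

    def __init__(self):
        self.handlers = {}
        self.hits = Counter()   # crude profiler: calls per patched address

    def install(self, guest_addr, handler):
        self.handlers[guest_addr] = handler

    def dispatch(self, guest_addr, state):
        self.hits[guest_addr] += 1
        return self.handlers[guest_addr](state)

    def hottest(self, n):
        return [addr for addr, _ in self.hits.most_common(n)]


def emulate_cli(state):
    state["virtual_if"] = False   # generic emulation path
    return "emulated"


def translated_cli(state):
    state["virtual_if"] = False   # stand-in for dynamically translated code
    return "translated"


table = NexusHandlers()
table.install(0x1000, emulate_cli)
table.install(0x2000, emulate_cli)
state = {"virtual_if": True}
for _ in range(3):
    table.dispatch(0x1000, state)
table.dispatch(0x2000, state)

# Replace the handler only for the most frequently hit address.
for addr in table.hottest(1):
    table.install(addr, translated_cli)
print(table.dispatch(0x1000, state))
```

The guest code itself is not touched by the swap; only the routine behind the call changes.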

>  Given when we are prescanning (which would be on
>  for guest kernel code) and running guest code at the safety of ring3,
>  perhaps we could create a special ring3 code segment which contains
>  the handling code to emulate branch and other instructions, and
>  a ring3 interrupt gate in the monitor IDT which allows for that handling
>  code to deal with certain interrupts.  One natural IDT slot would
>  be for the INT3 instruction, which is what we already plant on
>  virtualized instructions.  A ring3 handler could look up the
>  real instruction, and either emulate if it is simple, or defer
>  to the monitor at ring0 if not.  Branch instruction could be emulated
>  here.  Perhaps reads of segment selectors, other simple instructions.
>  Complicated instructions handled by the emulation mechanisms we have already
>  in the monitor, in which case we would have to step up to ring0.
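
The decision the quoted ring3 handler would make can be sketched as follows. The set of "simple" instructions and the `original_insns` lookup table are assumptions for illustration only; the quoted text names branches and segment selector reads as candidates.

```python
# Hypothetical set of instructions cheap enough to emulate without leaving ring3.
SIMPLE_IN_RING3 = {"jmp", "jcc", "call", "ret", "mov_from_sreg"}


def handle_int3(fault_addr, original_insns):
    """Look up the instruction hidden behind the int3 and pick where to handle it."""
    mnemonic = original_insns[fault_addr]
    if mnemonic in SIMPLE_IN_RING3:
        return ("ring3", mnemonic)   # emulate in the ring3 handler segment
    return ("ring0", mnemonic)       # defer to the monitor's emulation code


# Prescan would have recorded what each planted int3 replaced.
original_insns = {0x1000: "jmp", 0x1004: "lgdt"}
print(handle_int3(0x1000, original_insns))
print(handle_int3(0x1004, original_insns))
```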

It would be an improvement over what we have now, but I like my method better..
:)

>Though, I do like the more simple Ramon insert-call technique, especially
>because the call target code can be customized for a given
>guest instruction and thus quite simple.  It's worth looking at
>the specifics more.

Okay.  What I will do is this: I will take the standalone prescan program
you posted in February, update it to use the latest prescan code, and then
modify it to use my insert-call technique instead of int3s.  Once this
looks like it's working reliably, we can consider hauling it into the main
codebase.

It's almost Christmas recess over here (just got to attend a couple of
Christmas dinners this week :)), so I should have plenty of time to make
something nice.

-- Ramon