Kevin Lawton wrote:
> All or most of the device emulation (video board, hard drive,
> keyboard, etc) will be done at the user program level, and
> we'll make use of the standard C library interfaces and such
> to implement these.
>
> The reason I say most is that, for performance reasons,
> parts or all of some particular devices, such as the timer
> and interrupt controller chips, can likely be moved into
> the monitor domain. As was talked about before this would
> alleviate a lot of context switching between the host/guest
> contexts. We don't have to do this kind of thing right
> away. Though, it's worth pointing out that parts of
> quite a few devices can be moved into the monitor. For
> example the floppy controller could be done in the monitor,
> the floppy drive in the user program. The VGA adapter
> in the monitor, the CRT display in the user app. Etc. etc.
About the timers, I agree that putting them in the monitor
is probably best (it's not such a big shame design-wise,
either: newer x86's and clones have built-in timing
facilities.) However, why would you want to stick the VGA
in the monitor? Or the floppy? I'd rather keep the
design clean --- these devices are not so time-critical
that I expect putting them in the monitor will make
much of a difference...
> Anyways, so we need some kind of accurate time reference and
> timer services from this virtualization framework. For example,
> to emulate the CMOS RTC, you need to be notified once per second
> so you can update the clock.
Not really... reads/writes to the RTC will probably trap
back into the host OS (usermode) virtualisation code,
which can then use the host OS's timing facilities to
get the correct time (gettimeofday()). The other thing
you can do with the RTC is generate periodic interrupts
(not necessarily with a period of one second), which can
probably be translated into the monitor-code timing
facilities.
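To make the trap-and-read idea concrete, here is a minimal sketch (the function names and the set of handled registers are my illustration, not plex86's actual code) of a user-level device model answering guest reads of the CMOS RTC straight from the host clock. The register indices are the standard MC146818 layout, and values are returned in BCD as the real chip does by default:

```c
/* Sketch: answer guest CMOS RTC reads from the host clock instead
 * of maintaining our own counter. */
#include <sys/time.h>
#include <time.h>

static unsigned char to_bcd(unsigned v)
{
    return (unsigned char)(((v / 10) << 4) | (v % 10));
}

/* Called when the guest's "out 0x70 / in 0x71" sequence traps out
 * to the user-level device model with the selected register index. */
unsigned char rtc_read(unsigned index)
{
    struct timeval tv;
    struct tm tm;

    gettimeofday(&tv, NULL);
    localtime_r(&tv.tv_sec, &tm);

    switch (index) {
    case 0x00: return to_bcd(tm.tm_sec);
    case 0x02: return to_bcd(tm.tm_min);
    case 0x04: return to_bcd(tm.tm_hour);
    case 0x07: return to_bcd(tm.tm_mday);
    case 0x08: return to_bcd(tm.tm_mon + 1);
    case 0x09: return to_bcd(tm.tm_year % 100);
    default:   return 0; /* other registers omitted in this sketch */
    }
}
```

The point is that no once-per-second notification is needed: the host kernel already keeps the time, and we only pay for a trap when the guest actually touches the chip.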
> So our approach could go something like this. Each time,
> just before the monitor hands over execution to the guest
> code, we take a snapshot of time, using the RDTSC instruction.
> Linux even defines an asm macro for this. :^)
> Upon the next invocation of our monitor code (via
> an interrupt or exception) we take a 2nd sampling using
> the same instruction. Now we have an accurate time sample
> of how long the guest code actually ran without intervention.
> We pass this duration to the timer framework. If there are
> requests from the device models to be notified given the elapsed
> time, then we call them. If they live in the user app world,
> then we return back to the user app, which sees this as a return
> from the ioctl() call, and some fields are filled in, like how
> long we ran for, etc. If we were wicked perfectionists, we
> could subtract from our RDTSC values the number of cycles it
> takes to get the guest code started again and for the
> exception to occur.
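The sampling scheme described above could look roughly like this (names are illustrative, not plex86's; assumes an x86 host, since RDTSC is a Pentium-class instruction):

```c
/* Sketch: record the TSC just before resuming the guest, and again
 * when the next interrupt/exception brings us back; the difference
 * is the number of cycles the guest ran without intervention. */
#include <stdint.h>

static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

static uint64_t guest_entry_tsc;

void monitor_resume_guest(void)
{
    guest_entry_tsc = rdtsc();
    /* ... switch to the guest context here ... */
}

/* Called from the monitor's interrupt/exception entry path. */
uint64_t monitor_reentered(void)
{
    uint64_t elapsed = rdtsc() - guest_entry_tsc;
    /* Hand 'elapsed' to the timer framework, which then calls any
     * device models whose notification deadlines have passed. */
    return elapsed;
}
```

The "wicked perfectionist" correction would simply subtract a calibrated constant (monitor-exit plus monitor-entry overhead in cycles) from `elapsed`.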
There's a catch here: if we only count the time that the guest
code actually ran, then we're completely out of sync with real
time, which isn't good either. I mean, say we virtualise
Linux on Linux. Now if the virtual copy of Linux runs a
program that executes sleep(1), then we DO want the actual
sleep time to somewhat resemble one second, which, if I
understood it correctly, your method cannot guarantee.
We should look into using DOSEMU-style time-contraction.
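My reading of the DOSEMU idea (this is a rough sketch of the concept, not DOSEMU's actual code) is that the guest-visible clock normally tracks real time, but when the guest falls behind on timer interrupts we let its clock advance at only a fraction of real speed, so pending ticks can be delivered without a huge burst:

```c
/* Sketch of time contraction: contract guest time only while the
 * guest is behind on interrupt delivery; otherwise follow real
 * time, so sleep(1) in the guest really lasts about a second. */
#include <stdint.h>

#define CONTRACT_NUM 1   /* advance guest time at 1/4 speed ...  */
#define CONTRACT_DEN 4   /* ... while catching up on ticks       */

static uint64_t guest_time_us;   /* guest-visible clock           */
static unsigned pending_ticks;   /* timer irqs not yet delivered  */

void advance_guest_time(uint64_t host_elapsed_us)
{
    if (pending_ticks > 2)
        guest_time_us += host_elapsed_us * CONTRACT_NUM / CONTRACT_DEN;
    else
        guest_time_us += host_elapsed_us;
}
```

That would give us both properties: the RDTSC deltas drive the device models, while the contraction keeps the guest clock from drifting arbitrarily far from real time.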
> So if we wanted highly accurate timing, we need a mechanism for
> interrupting us in the middle. Fortunately, the built-in
> APIC on the Pentium has a timer based on CPU clock speed which
> can do this. It can be programmed to either periodic or one-shot
> mode. (thanks to one of the developers for suggesting use of
> this timer facility)
The problem is that we still don't know whether this timer
is present on AMD. I fear that it may not be.
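For reference, if the local APIC timer is available, using it is not much code. The register offsets below are from Intel's documentation (the APIC is normally mapped at physical 0xFEE00000); whether non-Intel parts provide this timer is exactly the open question. Only the count computation is testable outside ring 0, so the MMIO side is left as comments:

```c
/* Sketch of setting up the local APIC timer in periodic mode. */
#include <stdint.h>

#define APIC_LVT_TIMER      0x320  /* LVT timer register offset */
#define APIC_TIMER_INIT     0x380  /* initial-count register    */
#define APIC_TIMER_DIV      0x3E0  /* divide-configuration reg. */
#define APIC_TIMER_PERIODIC (1u << 17)

/* The timer counts down at the bus clock divided by 'divider';
 * this converts a desired interrupt rate into an initial count.
 * E.g. write the result to APIC_TIMER_INIT, the divider code to
 * APIC_TIMER_DIV, and vector | APIC_TIMER_PERIODIC to
 * APIC_LVT_TIMER. */
uint32_t apic_initial_count(uint32_t bus_hz, uint32_t divider,
                            uint32_t irqs_per_sec)
{
    return bus_hz / divider / irqs_per_sec;
}
```

If the timer turns out to be Intel-only, we would need a fallback (the PIT, presumably), so the monitor's timing interface should hide which hardware timer is underneath.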
Ramon