On 05/04/2018 03:42, Gabe Black wrote:
Hi folks. I'm continuing to try to iron out problems with KVM on ARM, and
the problem I'm working on specifically right now is that the mouse device
gets spurious bad command bytes which panics gem5.

What I've found so far is that the guest kernel will frequently time out
while waiting for an ACK to a byte it sent to the mouse, even though the
timeout looks like it should be 200ms, the simulation quantum I'm using is
1ms, and the delay between an event and the corresponding interrupt is
configured to be 1us. I think this eventually throws the PS2 driver out of
whack, and it ends up sending a data byte (or something else?) to the mouse
which the mouse misinterprets as a command, causing the panic.

Last time I looked at this, I suspected that the PS/2 model wasn't
clearing some interrupts. The GIC model in gem5 normally doesn't worry
about that and raises an interrupt every time someone calls
sendInt(). The behaviour I have observed from the kernel is that it
doesn't post a new interrupt unless you first clear the old interrupt.
This caused some issues with other models in the past (IIRC, the UART).
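
To make the difference concrete, here is a toy sketch of the two behaviours. It is not gem5 or KVM code; the class names, send_int(), clear_int() and deliver_bytes() are all hypothetical stand-ins, and the "latched" model only approximates what I described above.

```python
class AlwaysRaiseGic:
    """Hypothetical model: every send_int() call results in a delivery."""

    def __init__(self):
        self.delivered = 0

    def send_int(self):
        self.delivered += 1

    def clear_int(self):
        # Nothing to do: this model never tracks a pending state.
        pass


class LatchedGic:
    """Hypothetical model: a new interrupt is posted only after the old one is cleared."""

    def __init__(self):
        self.pending = False
        self.delivered = 0

    def send_int(self):
        if self.pending:
            # The previous interrupt was never cleared, so this one is lost.
            return
        self.pending = True
        self.delivered += 1

    def clear_int(self):
        self.pending = False


def deliver_bytes(gic, n_bytes, clear_between):
    """Pretend a device signals n_bytes bytes; return how many interrupts arrived."""
    for _ in range(n_bytes):
        gic.send_int()
        if clear_between:
            gic.clear_int()
    return gic.delivered
```

With a device that never clears its interrupt, the first model still delivers one interrupt per byte, while the second delivers only the first one. A guest waiting for an ACK would then see a timeout.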

Make sure you test this in a single-threaded simulator as well to avoid
other weirdness due to thread synchronisation in gem5. I assume you're
already doing this though.

My current theory for why that's happening is that even when the VM is not
running, the hardware-supported virtual timer that the CPU may have scheduled
to keep track of its timeout may be "running", in the sense that the kernel
will update it to reflect the descheduled time once the VM is running
again. That could mean that 200ms of real time could pass and look like
200ms of simulated time to the VCPU, even if a smaller amount of actual
execution time was supposed to happen. I'm not sure if that's a correct
interpretation, but this ASPLOS paper *seems* to say something like that is
possible.

http://www.cs.columbia.edu/~cdall/pubs/asplos019-dall.pdf

I have never been happy with the way we handle the timer on the Arm KVM
CPUs. It's possible to re-sync the virtual counter when entering into
KVM. A simple way to test that would be to update KVM_REG_ARM_TIMER_CNT
/ MISCREG_CNTVCT whenever entering into KVM. The Linux side should
update the virtual timer offset when you write an absolute time to this
register.
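
As a rough illustration of why re-syncing on entry should help, here is a toy model of the virtual counter as "physical count minus offset". Everything in it is an assumption for the sake of the example: the VirtualCounter class, guest_elapsed_ms(), the 1 MHz counter frequency, and the scenario of one 1ms quantum followed by 200ms of host time with the VM descheduled. It does not call into KVM.

```python
class VirtualCounter:
    """Toy model: guest-visible count = physical count - offset."""

    def __init__(self):
        self.physical = 0  # free-running host counter, in ticks
        self.offset = 0    # virtual offset maintained by the hypervisor

    def advance_host(self, ticks):
        # Host time moves on whether or not the guest is running.
        self.physical += ticks

    def read(self):
        return self.physical - self.offset

    def write(self, value):
        # Writing an absolute count is modelled as recomputing the offset
        # so that the next read returns `value`.
        self.offset = self.physical - value


def guest_elapsed_ms(resync, freq_hz=1_000_000):
    """Guest-visible time after a 1 ms quantum plus 200 ms spent descheduled."""
    ticks_per_ms = freq_hz // 1000
    counter = VirtualCounter()
    sim_ticks = 0

    # The guest executes for one simulation quantum (1 ms).
    counter.advance_host(1 * ticks_per_ms)
    sim_ticks += 1 * ticks_per_ms

    # The VM is descheduled: host time passes, simulated time does not.
    counter.advance_host(200 * ticks_per_ms)

    # On the next entry, optionally write the simulated time back.
    if resync:
        counter.write(sim_ticks)

    return counter.read() // ticks_per_ms
```

Without the re-sync the guest sees 201ms instead of 1ms, which is enough to trip a 200ms timeout.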

This should work for Linux, but you might have issues with other OSes
that insist on using the physical timer instead of the virtual timer.

I've also seen very weird behavior as far as how many instructions KVM
thinks are being executed per tick, so I wouldn't discount there being
something off about how it's keeping track of time. I haven't been able to
attach GDB to the KVM VCPUs for instance, even though it looks like all the
pieces are there for that to work. It seems that KVM is supposed to exit
after a given number of instructions, but it just doesn't for some reason.

I have used GDB in the past, but the support is very flaky. To use GDB
with KVM, I had to force a thread context sync on every KVM entry/exit.
You can do this by setting the alwaysSyncTC param, but it will kill
your performance. The proper fix for this issue is to implement a custom
KVM thread context that lazily synchronises individual registers instead
of only synchronising on drain (and some other calls).
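
To sketch what I mean by lazy synchronisation: the toy class below fetches a register from the backing state only when it is first read, and writes back only the registers that were modified before the next entry. LazyThreadContext, its method names, and the dict standing in for KVM's register state are all hypothetical; this is not the gem5 thread context interface.

```python
class LazyThreadContext:
    """Toy register cache that syncs individual registers on demand."""

    def __init__(self, backend):
        self.backend = backend   # dict standing in for the in-kernel register state
        self.cache = {}          # registers fetched or written since the last entry
        self.dirty = set()       # registers that must be written back
        self.backend_reads = 0
        self.backend_writes = 0

    def read_reg(self, name):
        # Fetch from the backend only the first time a register is needed.
        if name not in self.cache:
            self.cache[name] = self.backend[name]
            self.backend_reads += 1
        return self.cache[name]

    def write_reg(self, name, value):
        # Defer the write-back until the guest is about to run again.
        self.cache[name] = value
        self.dirty.add(name)

    def on_kvm_entry(self):
        # Flush only the modified registers, then drop the cache because
        # the guest may change any register while it runs.
        for name in self.dirty:
            self.backend[name] = self.cache[name]
            self.backend_writes += 1
        self.dirty.clear()
        self.cache.clear()
```

A debugger that only inspects the PC and pokes one register then costs one read and one write per stop, rather than a full context sync on every entry and exit.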

Cheers,
Andreas
