On 05/04/2018 23:12, Gabe Black wrote:
> On Thu, Apr 5, 2018 at 8:14 AM, Andreas Sandberg <[email protected]> wrote:
> <snip>
> I've also seen very weird behavior in how many instructions KVM thinks
> are being executed per tick, so I wouldn't discount there being
> something off about how it keeps track of time. I haven't been able to
> attach GDB to the KVM VCPUs, for instance, even though it looks like
> all the pieces are there for that to work. It seems that KVM is
> supposed to exit after a given number of instructions, but for some
> reason it just isn't.

>> I have used GDB in the past, but the support is very flaky. To use
>> GDB with KVM, I had to force a thread context sync on every KVM
>> entry/exit. You can do this by setting the alwaysSyncTC param, but it
>> will kill your performance. The proper fix for this issue is to
>> implement a custom KVM thread context that lazily synchronises
>> individual registers instead of only synchronising on drain() (and
>> some other calls).

> This sounds to me like you had problems with it giving you valid
> information or running commands properly. I had problems with it even
> breaking into gdb in the first place; the VCPUs just ran free until
> gdb gave up. I saw messages saying the event that was supposed to make
> the CPUs stop was already scheduled, so I think it was just never
> getting triggered by the KVM CPU for some reason. We're going to get a
> bigger and better machine to run KVM simulations on in the relatively
> near future, and my hope is that some of these weird issues magically
> go away on different hardware.

I think there are two classes of problems here: context synchronisation
and multi-threaded KVM issues.

The description above only really covers the context synchronisation
issue. The root cause is that we synchronise the TC lazily to avoid the
cost of transferring a lot of state between gem5 and the kernel. The
simulator keeps track of when the TC is dirty by setting
threadContextDirty, and of when KVM has dirty state by setting
kvmStateDirty. Whenever gem5 might want to access the TC (e.g., when
getContext() is called or on drain()), we call syncThreadContext(),
which updates the TC if the KVM state is dirty. Conversely, whenever we
enter KVM, we update the KVM state if the TC is dirty. If I remember
correctly, GDB was holding on to a pointer to the TC, which meant that
the thread context wasn't being synchronised properly. That's why it
started working when I enabled alwaysSyncTC.

I think you might be hitting the other class of problems as well. IIRC,
gdb uses the instruction event queue to trigger an exit into gem5. This
should work in a single-threaded setup, since events are inserted while
the CPU isn't running. When entering KVM, we calculate the number of
instructions to execute and call setupInstCounter() to arm a perf
counter that triggers an exit after a fixed number of instructions. If
the instruction event is inserted from a different thread, we'd need to
first stop the CPU and then insert the event to ensure that it is
handled correctly.

The best solution is probably to force a global barrier and call kick()
on all KVM CPUs to force an exit from KVM. Another option would be to
schedule the instruction stop (you'll probably have to lock the CPU's
EQ for this) and then call kick() to force the CPU to service the
instruction queue. Some rough sketches of these mechanisms follow
below.
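Roughly, the dirty-flag scheme described above looks like this. This is
a simplified sketch, not the real BaseKvmCPU code (that lives in
src/cpu/kvm/base.cc); the update*() bodies stand in for the
arch-specific register copies:

    // Illustrative sketch of lazy TC <-> KVM state synchronisation.
    struct ThreadContext {};  // placeholder for gem5's ThreadContext

    class KvmCpuSketch
    {
      public:
        // gem5 wants to inspect or modify the TC (getContext(),
        // drain(), ...): pull newer kernel-side state out of KVM first.
        ThreadContext *getContext()
        {
            syncThreadContext();
            return &tc;
        }

        // One cycle of guest execution.
        void tick()
        {
            syncKvmState();          // push gem5-side changes into KVM
            kvmRun();                // guest runs; KVM state is now newer
            kvmStateDirty = true;
            if (alwaysSyncTC)        // the param mentioned above: pay
                syncThreadContext(); // for a full sync on every exit
        }

      private:
        void syncThreadContext()
        {
            if (!kvmStateDirty)
                return;
            updateThreadContext();   // copy KVM registers -> TC
            kvmStateDirty = false;
        }

        void syncKvmState()
        {
            if (!threadContextDirty)
                return;
            updateKvmState();        // copy TC -> KVM registers
            threadContextDirty = false;
        }

        void updateThreadContext() { /* arch-specific transfer */ }
        void updateKvmState()      { /* arch-specific transfer */ }
        void kvmRun()              { /* ioctl(vcpuFd, KVM_RUN, 0) */ }

        ThreadContext tc;
        bool threadContextDirty = false;  // TC modified since last sync
        bool kvmStateDirty = false;       // KVM modified since last sync
        bool alwaysSyncTC = false;        // set from the Python config
    };

Note how anything holding a raw ThreadContext pointer across a KVM
entry/exit bypasses getContext() and therefore the sync; that is
exactly the GDB failure mode described above.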
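The instruction-count exit is built on a perf hardware counter armed
with a sample period, whose overflow delivers a signal that interrupts
KVM_RUN. Here is a rough standalone sketch using perf_event_open(2).
This is not gem5's actual PerfKvmCounter code, and for simplicity it
attaches the counter to the calling process rather than to a specific
VCPU thread:

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE              // for F_SETSIG
    #endif
    #include <fcntl.h>
    #include <linux/perf_event.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstring>

    // Arm a hardware instruction counter that raises 'signo' after
    // 'insts' retired instructions. The signal interrupts KVM_RUN,
    // which returns to user space so the simulator regains control.
    static int
    armInstCounter(uint64_t insts, int signo)
    {
        perf_event_attr attr;
        std::memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_HARDWARE;
        attr.config = PERF_COUNT_HW_INSTRUCTIONS;
        attr.sample_period = insts;  // overflow after 'insts' insts
        attr.wakeup_events = 1;
        attr.disabled = 1;
        attr.exclude_host = 1;       // count guest instructions only

        // pid=0, cpu=-1: count for the calling thread on any CPU. The
        // real code would attach this to the VCPU thread instead.
        int fd = static_cast<int>(
            syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0));
        if (fd == -1)
            return -1;

        // Request asynchronous signal delivery on counter overflow.
        fcntl(fd, F_SETOWN, getpid());
        fcntl(fd, F_SETSIG, signo);
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_ASYNC);

        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
        return fd;
    }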
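And for the second option (schedule the stop, then kick), a sketch of a
hypothetical helper. scheduleInstStop() does not exist in gem5; I'm
assuming EventQueue::lock()/unlock()/schedule() and BaseKvmCPU::kick()
are usable from the calling thread:

    #include "cpu/kvm/base.hh"  // BaseKvmCPU (kick())
    #include "sim/eventq.hh"    // EventQueue, Event, Tick

    // Hypothetical helper: insert an instruction-stop event into a
    // running VCPU's queue from another thread, then force the VCPU
    // out of KVM so it notices the event.
    void
    scheduleInstStop(BaseKvmCPU *cpu, Event *stopEvent, Tick when)
    {
        EventQueue *eq = cpu->eventQueue();

        // The VCPU thread may be servicing this queue concurrently,
        // so take the queue lock around the insertion.
        eq->lock();
        eq->schedule(stopEvent, when);
        eq->unlock();

        // kick() signals the VCPU thread, making KVM_RUN return so
        // the newly scheduled event gets serviced.
        cpu->kick();
    }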
//Andreas
