To avoid having to sign into github or writing a big comment on an only semi-related CL, I'll put this in an email. I did some digging and I think I know what the problem is from and why your patch fixes it, based on my reading of gem5 and the kernel's source.
First, the guest tries to do a mov from the magic device into %rax. In the VM, that causes a page fault since the magic device isn't memory backed, and that causes a VM exit. The kernel then attempts to emulate the instruction, and discovers that it needs to do an MMIO exit in order to complete emulation. As part of this emulation, the kernel decodes the instruction and determines that the destination operand is a register. There are two bit vectors which represent all of the registers and says if they're dirty and valid. When the destination register is decoded, both bits for the destination register are set, although no actual write has taken place yet since there's no data yet. KVM propagates the exit to userspace, and gem5 does the IPR access (the pseudo instruction) and sets the "data" value to pass back to KVM to be the return code of the pseudo instruction. It also sets a flag which says the thread context is dirty in case it's changed somehow. Later, gem5 gets ready to rereturns to KVM. As part of returning to KVM, gem5 checks to see if it needs to update the registers with KVM and calls an IOCTL to do so since that flag was set earlier. In the kernel, another flag is set that says that the VCPU structure has newer register values than the ones in the emulation context where the two bit vectors described earlier are, and where the return value of the emulated instruction went. When back in KVM in the kernel, the instruction emulator finishes up, taking the newly available data and poking it into the destination register. I wasn't able to identify exactly where that happens, but it's secondary to the actual problem. As part of this process, the emulator realizes that that flag is set which says somebody wrote to the registers from userspace, and it clears the two bit vectors mentioned earlier, invalidating the value we returned from the pseudo instruction. I think this is supposed to protect any register updates userspace made which might be clobbered by the instruction which caused the access? Not entirely sure. Anyway, we eventually fully re-enter the VM, having dropped the return value of the pseudo instruction. So it looks like what we might have to do to fix things is to not (only) return the result code as the result of the memory operation, we may also have to inject it into the right register in the thread context so that even though KVM's instruction emulator allegedly drops the value on the floor, the registers are still updated correctly. Alternatively we could make it a rule that no pseudo instruction is allowed to change any register values, obviating the need to make the thread context dirty and writing registers back to KVM. We really should be tracking whether the thread context is dirty somewhere that actually knows, rather than trying to catch where it could happen and forcing the flag. Anyway, hope that helps! Gabe On Wed, Oct 17, 2018 at 2:51 PM Jason Lowe-Power (Gerrit) < [email protected]> wrote: > Patch Set 1: > > Patch Set 1: > > Patch Set 1: > > Patch Set 1: > > With this patch, I am able to boot up gem5's X86KvmCPU on my intel machine > i7-6700K. > Pseudo instructions fail to work however. > > Pseudo-instructions won't work unless you have a special, memory-mapped, > m5ops implementation. IIRC, Makefile currently enables that by default, so > you might just need to recompile. Old binary versions on gem5.org > probably don't have this enabled by default though. > > Yeah, but the KVM support for the memory mapped m5ops is also broken. > IIRC, the return from the magic instruction isn't written to the > architectural register in gem5 correctly. I have a patch which "fixes" it, > but probably not in the right way. I'll try to post it later today. > > Jason- I believe I am experiencing the issue with memory mapped m5ops. Was > the patch you mentioned posted? > > I don't think the patch is ready for prime time. I need to find time to > dig in and try to remember why it's not working and what this patch > actually does. But here it is: > https://github.com/darchr/gem5/commit/d8041a088fab8ef11a8783a55d25e184265940b0 > > View Change <https://gem5-review.googlesource.com/c/public/gem5/+/12278> > > > To view, visit change 12278 > <https://gem5-review.googlesource.com/c/public/gem5/+/12278>. To > unsubscribe, or for help writing mail filters, visit settings > <https://gem5-review.googlesource.com/settings>. > Gerrit-Project: public/gem5 > Gerrit-Branch: master > Gerrit-Change-Id: I000c7ba102ba161c2bb5e224bf826216cf0ff87a > Gerrit-Change-Number: 12278 > Gerrit-PatchSet: 1 > Gerrit-Owner: Jason Lowe-Power <[email protected]> > Gerrit-Reviewer: Andreas Sandberg <[email protected]> > Gerrit-Reviewer: Gabe Black <[email protected]> > Gerrit-Reviewer: Jason Lowe-Power <[email protected]> > Gerrit-CC: Alexandru Duțu <[email protected]> > Gerrit-CC: Matthew Poremba <[email protected]> > Gerrit-CC: Swapnil Haria <[email protected]> > Gerrit-Comment-Date: Wed, 17 Oct 2018 21:51:53 +0000 > Gerrit-HasComments: No > Gerrit-Has-Labels: No > Gerrit-MessageType: comment > _______________________________________________ gem5-dev mailing list [email protected] http://m5sim.org/mailman/listinfo/gem5-dev
