On 02.10.2010, at 03:56, Alexander Graf wrote:

>
> On 01.10.2010, at 21:22, Zachary Amsden <[email protected]> wrote:
>
>> On 10/01/2010 04:46 AM, Alexander Graf wrote:
>>> On 01.10.2010, at 13:21, Nadav Har'El wrote:
>>>
>>>
>>>> On Thu, Sep 30, 2010, Zachary Amsden wrote about "Re: TSC in nested SVM
>>>> and VMX":
>>>>
>>>>> 1) When reading an MSR, we are not emulating the L2 guest; we are
>>>>> DIRECTLY reading the MSR for the L1 emulation. Any emulation of the L2
>>>>> guest is actually done by the code running /inside/ the L1 emulation, so
>>>>> MSR reads for the L2 guest are handled by L1, and MSR reads for the L1
>>>>> guest are handled by L0, which is this code.
>>>>> ...
>>>>> So if we are currently running nested, the L1 tsc_offset is stored in
>>>>> the nested.hsave field; the vmcb which is active is polluted by the L2
>>>>> guest offset, which would be incorrect to return to the L1 emulation.
>>>>>
>>>> Thanks for the detailed explanation.
>>>>
>>>> It seems, then, that the nested VMX logic is somewhat different from that
>>>> of the nested SVM. In nested VMX, if a function gets called when running
>>>> L1, the current VMCS will be that of L1 (aka vmcs01), not of its guest L2
>>>> (and I'm not even sure *which* L2 that would be when there are multiple
>>>> L2 guests for the one L1).
>>>>
>>> If the #vmexit comes while you're in L1, everything works on the L1's vmcb.
>>> If you hit it while in L2, everything works on the L2's vmcb unless special
>>> attention is taken.
>>>
>>> The reason behind the TSC shift is very simple. With the tsc_offset setting
>>> we're trying to adjust the L1's offset. Adjusting the L1's offset means we
>>> need to adjust L1 and L2 alike, as the virtual L2's offset == L1 offset +
>>> vmcb L2 offset, because L2's TSC is also offset by the amount L1 is.
>>>
>>> So basically what happens is:
>>>
>>> nested VMRUN:
>>>
>>>     svm->vmcb->control.tsc_offset += nested_vmcb->control.tsc_offset;
>>>
>>> please note the +=!
>>>
>>>
>>> svm_write_tsc_offset:
>>>
>>> This gets called when we really want to set the current level's TSC offset
>>> only, because the guest issued a tsc write. In L2 this means the L2's value.
>>>
>>>     if (is_nested(svm)) {
>>>         g_tsc_offset = svm->vmcb->control.tsc_offset -
>>>                        svm->nested.hsave->control.tsc_offset;
>>>
>>> Remember the difference between L1 and L2.
>>>
>>>         svm->nested.hsave->control.tsc_offset = offset;
>>>
>>> Set L1 to the new offset.
>>>
>>>     }
>>>
>>>     svm->vmcb->control.tsc_offset = offset + g_tsc_offset;
>>>
>>> Set L2 to new offset + delta.
>>>
>>>
>>> So what this function does is that it treats TSC writes as L1 writes even
>>> while in L2 and adjusts L2 accordingly. Joerg, this sounds fishy to me. Are
>>> you sure this is intended and works when L1 doesn't intercept MSR writes to
>>> TSC?
>>>
>>
>> L1 must intercept MSR writes to TSC for this to work. It does, so all is
>> well.
>
> Sure, in nested kvm all is fine because we becer

never

> hit the above code path. But other nypervisors

hypervisors

> might not intercept tsc writes which should only be reflected in an l2 tsc
> offset change, no?

Note to self: proof-read mails when writing from a phone.

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
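
For anyone following the arithmetic in the quoted walkthrough, here is a
minimal standalone C sketch of the two offsets. It is not the kernel code:
struct tsc_model and both model_* functions are hypothetical names made up
for illustration, and the step that copies the L1 offset into the hsave
field on VMRUN is an assumption based on the remark that "the L1 tsc_offset
is stored in the nested.hsave field".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state discussed in the thread. */
struct tsc_model {
    int64_t vmcb_offset;   /* models svm->vmcb->control.tsc_offset */
    int64_t hsave_offset;  /* models svm->nested.hsave->control.tsc_offset */
    bool nested;           /* models is_nested(svm) */
};

/*
 * Nested VMRUN: remember L1's offset (assumed to happen here), then add
 * L2's offset on top of it. This is the "+=" from the walkthrough.
 */
void model_nested_vmrun(struct tsc_model *m, int64_t l2_offset)
{
    assert(!m->nested);
    m->hsave_offset = m->vmcb_offset;
    m->vmcb_offset += l2_offset;
    m->nested = true;
}

/*
 * TSC offset write as quoted: the new offset is always applied to L1,
 * and the active (L2) offset becomes new offset + preserved L1/L2 delta.
 */
void model_write_tsc_offset(struct tsc_model *m, int64_t offset)
{
    int64_t g_tsc_offset = 0;

    if (m->nested) {
        /* Remember the difference between L1 and L2. */
        g_tsc_offset = m->vmcb_offset - m->hsave_offset;
        /* Set L1 to the new offset. */
        m->hsave_offset = offset;
    }
    /* Set the active vmcb to new offset + delta. */
    m->vmcb_offset = offset + g_tsc_offset;
}
```

With an L1 offset of 100 and an L2 offset of 30, model_nested_vmrun() leaves
the active offset at 130. A subsequent model_write_tsc_offset(&m, 500) while
nested moves the saved L1 offset to 500 and the active offset to 530, so the
L2 delta of 30 is untouched. That is exactly the case in question: a write
issued from L2 that L1 does not intercept would, in this model, change L1's
offset rather than only L2's.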
