On 15.12.2010, at 01:00, Scott Wood wrote:
> On Wed, 15 Dec 2010 00:29:40 +0100
> Alexander Graf <[email protected]> wrote:
>
>> On 15.12.2010, at 00:17, Scott Wood wrote:
>>
>>> So that once KVM has an interrupt to deliver, and sees that critical is
>>> engaged, it knows that the next magic page store will resolve things.
>>> Either it is a store to critical, and KVM can now deliver the
>>> interrupt -- or it is some other store (scratch or MSR itself) and thus
>>> int_pending has not yet been checked.
>>>
>>> I don't think it would be a problem for live patching. It just seems a
>>> bit icky.
>>
>> Oh, because you'd only trap stores, but no writes? Yep, that would work.
>
> "writes" or "loads"? :-)
s/writes/loads/ :). Sorry, it's 1am here :).
>
>> I actually like that idea. It's probably the cleanest we can get away with
>> without deep modifications of the guest. Single-step is always icky.
>
> Well, there's another complication -- if we trap on the final store to
> end the critical section, the critical section won't actually be ended
> until after that instruction executes. Which won't happen until we set
> the page to read/write and let it go. So we'd have to look at the
> instruction to see what it's doing.
Yep, which is why I proposed the thing below. We'd have to emulate the uncrit
store and then automatically inject the interrupt because we're not in the crit
section anymore.
>
>> Thinking about the whole thing - can't we create an "interrupt notification
>> page"? Some page that is always mapped read-only when interrupts are
>> available, but read-write when they're not? > Then we could just do an
>> unconditional store after the crit section is done and everyone's happy.
>
> I'd limit it to interrupts that were deferred due to critical,
> to avoid unnecessary MMU manipulation, and unnecessary traps when doing
> mtmsr/wrtee if there's an interrupt pending and old EE = new EE = zero
> (assuming the guest doesn't use a separate restore path for that case).
Hrm, we already do treat EE 0 -> 1 changes differently in the code:
/* Check if we have to fetch an interrupt */
lwz r31, (KVM_MAGIC_PAGE + KVM_MAGIC_INT)(0)
cmpwi r31, 0
beq+ no_check
/* Check if we may trigger an interrupt */
andi. r30, r30, MSR_EE
beq no_check
SCRATCH_RESTORE
/* Nag hypervisor */
kvm_emulate_mtmsrd_orig_ins:
tlbsync
b kvm_emulate_mtmsrd_branch
So the only thing we need to do is to get rid of the MAGIC_INT check and
unconditionally store a random register into the notification page
(-2*PAGE_SIZE) instead of the tlbsync instruction (sure, things there are
slightly more compllicated because we actually use the real mtmsr, but you get
the point).
>
> But otherwise sounds reasonable, if we're willing to change the
> interface that much. Does it even need to be read-only, or could it be
> entirely unmapped when there's a pending interrupt?
I don't see why we shouldn't add this to the interface. But I'd leave this in
the back of our heads for a couple more days so in case we end up coming up
with an even better idea, we rather implement that one :)
It could be read-only or unmapped - doesn't really matter. It has to be a
store, because otherwise we'd clobber a guest register. It could however be a
single physical page shared by all guests. We don't care about the contents
written into that page anyways.
Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html