On 17.02.2012, at 22:12, Scott Wood wrote:

> On 02/17/2012 11:13 AM, Alexander Graf wrote:
>> From: Scott Wood <scottw...@freescale.com>
>> 
>> Chips such as e500mc that implement category E.HV in Power ISA 2.06
>> provide hardware virtualization features, including a new MSR mode for
>> guest state.  The guest OS can perform many operations without trapping
>> into the hypervisor, including transitions to and from guest userspace.
>> 
>> Since we can use SRR1[GS] to reliably tell whether an exception came from
>> guest state, instead of messing around with IVPR, we use DO_KVM similarly
>> to book3s.
>> 
>> Current issues include:
>> - Machine checks from guest state are not routed to the host handler.
>> - The guest can cause a host oops by executing an emulated instruction
>>   in a page that lacks read permission.  Existing e500/4xx support has
>>   the same problem.
>> 
>> Includes work by Ashish Kalra <ashish.ka...@freescale.com>,
>> Varun Sethi <varun.se...@freescale.com>, and
>> Liu Yu <yu....@freescale.com>.
>> 
>> Signed-off-by: Scott Wood <scottw...@freescale.com>
>> [agraf: remove pt_regs usage]
>> Signed-off-by: Alexander Graf <ag...@suse.de>
>> ---
> 
> Thanks for picking this up!
> 
>> +static unsigned long get_guest_esr(struct kvm_vcpu *vcpu)
>> +{
>> +#ifdef CONFIG_KVM_BOOKE_HV
>> +    return mfspr(SPRN_ESR);
>> +#else
>> +    return vcpu->arch.shared->esr;
>> +#endif
>> +}
> 
> s/SPRN_ESR/SPRN_GESR/

Ouch :)
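For the record, the fixed helper would read SPRN_GESR. A self-contained sketch, with the SPR file, mfspr() and the vcpu layout mocked as stand-ins for the kernel's real plumbing:

```c
/* Mocked SPR plumbing so the helper can be exercised standalone;
 * the real SPRN_GESR definition lives in asm/reg_booke.h. */
#define SPRN_GESR 0x17F
static unsigned long spr_file[0x400];
static unsigned long mfspr(int sprn) { return spr_file[sprn]; }

#define CONFIG_KVM_BOOKE_HV 1

/* Pared-down vcpu, just enough for the non-HV fallback path. */
struct kvm_vcpu_arch_shared { unsigned long esr; };
struct kvm_vcpu { struct { struct kvm_vcpu_arch_shared *shared; } arch; };

static unsigned long get_guest_esr(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_KVM_BOOKE_HV
	(void)vcpu;
	return mfspr(SPRN_GESR);	/* guest ESR, not the host's SPRN_ESR */
#else
	return vcpu->arch.shared->esr;
#endif
}
```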

> 
>> int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>                        unsigned int exit_nr)
>> {
>> -    enum emulation_result er;
>>      int r = RESUME_HOST;
>> 
>>      /* update before a new last_exit_type is rewritten */
>>      kvmppc_update_timing_stats(vcpu);
>> 
>> +    switch (exit_nr) {
>> +    case BOOKE_INTERRUPT_EXTERNAL:
>> +            do_IRQ(current->thread.regs);
>> +            break;
> 
> What will current->thread.regs point to here?  Something on the stack
> from the last normal host exception entry?

Yup. regs only contains a few volatile register values that we shouldn't care 
about at this point anyway, right?

> We probably want to create a pt_regs on the stack and at least provide
> PC, LR, and r1 for perfmon interrupts and such.

We don't want guest information to leak into the host kernel, do we? From the 
host's POV, any exit inside the guest happens right at the guest exit path of 
KVM.

The way we handle this on book3s is that we jump right into the real handler 
from asm code, as soon as we've recovered all the MMU state. We could do the 
same here: move this whole thing off to asm code and jump straight to the 
actual interrupt handler, rather than calling do_IRQ() as above.

Then register state in regs would also be guaranteed to be sane, since it's 
created by the interrupt handler itself.
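If we do stay in C, here's roughly what the pt_regs-on-stack variant could look like. This is a rough sketch: the pt_regs is pared down, and kvm_exit_state is a hypothetical container for host-side values at the KVM exit path, so nothing about the guest leaks out.

```c
/* Pared-down pt_regs; field names follow arch/powerpc (gpr, nip,
 * link), but this struct is an illustration, not the kernel's. */
struct pt_regs {
	unsigned long gpr[32];
	unsigned long nip;	/* PC */
	unsigned long link;	/* LR */
};

/* Hypothetical snapshot of host-side state at the guest exit path;
 * only host values go in, so no guest state is exposed. */
struct kvm_exit_state {
	unsigned long host_pc;
	unsigned long host_lr;
	unsigned long host_r1;
};

/* Build the minimal frame perfmon interrupts need for a sane
 * backtrace: PC, LR and the stack pointer (r1). */
static void kvmppc_fill_pt_regs(struct pt_regs *regs,
				const struct kvm_exit_state *s)
{
	regs->nip = s->host_pc;
	regs->link = s->host_lr;
	regs->gpr[1] = s->host_r1;
}
```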

> 
>> @@ -384,30 +558,56 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
>> kvm_vcpu *vcpu,
>> 
>>      switch (exit_nr) {
>>      case BOOKE_INTERRUPT_MACHINE_CHECK:
>> -            printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR));
>> -            kvmppc_dump_vcpu(vcpu);
>> -            r = RESUME_HOST;
>> +            kvm_resched(vcpu);
>> +            r = RESUME_GUEST;
>>              break;
> 
> Leave this bit out (proper machine check handling will come later).

Ok, I'll re-add it in a later patch on top that also sets 
run->hw.hardware_exit_reason to something useful for debugging, so user space 
can abort.

> 
>>      case BOOKE_INTERRUPT_PROGRAM:
>> -            if (vcpu->arch.shared->msr & MSR_PR) {
>> +            if (vcpu->arch.shared->msr & (MSR_PR | MSR_GS)) {
>>                      /* Program traps generated by user-level software must 
>> be handled
>>                       * by the guest kernel. */
>>                      kvmppc_core_queue_program(vcpu, vcpu->arch.fault_esr);
> 
> Should update the comment for why we're checking GS (i.e. we get a
> different trap for emulation with GS-mode).

k

> 
>> +#define SET_VCPU(vcpu)              \
>> +        PPC_STL     vcpu, (THREAD + THREAD_KVM_VCPU)(r2)
> 
> Change spaces to tab before PPC_STL

The whole thing gets removed a few patches later.

> 
>> +#define LONGBYTES           (BITS_PER_LONG / 8)
>> +
>> +#define VCPU_GPR(n)         (VCPU_GPRS + (n * LONGBYTES))
>> +#define VCPU_GUEST_SPRG(n)  (VCPU_GUEST_SPRGS + (n * LONGBYTES))
>> +
>> +/* The host stack layout: */
>> +#define HOST_R1         (0 * LONGBYTES) /* Implied by stwu. */
>> +#define HOST_CALLEE_LR  (1 * LONGBYTES)
>> +#define HOST_RUN        (2 * LONGBYTES) /* struct kvm_run */
>> +/*
>> + * r2 is special: it holds 'current', and it is made nonvolatile in the
>> + * kernel with the -ffixed-r2 gcc option.
>> + */
>> +#define HOST_R2         (3 * LONGBYTES)
>> +#define HOST_NV_GPRS    (4 * LONGBYTES)
>> +#define HOST_NV_GPR(n)  (HOST_NV_GPRS + ((n - 14) * LONGBYTES))
>> +#define HOST_MIN_STACK_SIZE (HOST_NV_GPR(31) + LONGBYTES)
>> +#define HOST_STACK_SIZE ((HOST_MIN_STACK_SIZE + 15) & ~15) /* Align. */
>> +#define HOST_STACK_LR   (HOST_STACK_SIZE + LONGBYTES) /* In caller stack 
>> frame. */
>> +
>> +#define NEED_EMU            0x00000001 /* emulation -- save nv regs */
>> +#define NEED_DEAR           0x00000002 /* save faulting DEAR */
>> +#define NEED_ESR            0x00000004 /* save faulting ESR */
>> +
>> +/*
>> + * On entry:
>> + * r4 = vcpu, r5 = srr0, r6 = srr1
>> + * saved in vcpu: cr, ctr, r3-r13
>> + */
>> +.macro kvm_handler_common intno, srr0, flags
>> +    mfspr   r10, SPRN_PID
>> +    lwz     r8, VCPU_HOST_PID(r4)
>> +    PPC_LL  r11, VCPU_SHARED(r4)
>> +    PPC_STL r14, VCPU_GPR(r14)(r4) /* We need a non-volatile GPR. */
>> +    li      r14, \intno
>> +
>> +    stw     r10, VCPU_GUEST_PID(r4)
>> +    mtspr   SPRN_PID, r8
>> +
>> +    .if     \flags & NEED_EMU
>> +    lwz     r9, VCPU_KVM(r4)
>> +    .endif
>> +
>> +#ifdef CONFIG_KVM_EXIT_TIMING
>> +    /* save exit time */
>> +1:  mfspr   r7, SPRN_TBRU
>> +    mfspr   r8, SPRN_TBRL
>> +    mfspr   r9, SPRN_TBRU
>> +    cmpw    r9, r7
>> +    PPC_STL r8, VCPU_TIMING_EXIT_TBL(r4)
>> +    bne-    1b
>> +    PPC_STL r9, VCPU_TIMING_EXIT_TBU(r4)
>> +#endif
> 
> As you pointed out to me last time, r9 is clobbered if exit timing is
> enabled (but see below, the load of VCPU_KVM can be removed along with
> the subsequent load of LVM_LPID(r9)).

Yes, fixed a few patches later. I went with the "put patches on top of your 
patches" style instead of rebasing and modifying your patches too much, so that 
it's easier for you guys to integrate this downstream.
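On the exit timing bits themselves: the TBU/TBL loop above is the standard trick for reading a coherent 64-bit timebase on a 32-bit core. In C, with mocked register reads standing in for the mfspr instructions, it amounts to:

```c
#include <stdint.h>

/* Mock timebase halves; on real hardware these would be
 * mfspr(SPRN_TBRU) and mfspr(SPRN_TBRL). */
static uint32_t mock_tbu, mock_tbl;
static uint32_t read_tbu(void) { return mock_tbu; }
static uint32_t read_tbl(void) { return mock_tbl; }

/* Sample the upper half, the lower half, then the upper half again,
 * and retry if the upper half changed, i.e. the lower half wrapped
 * between the two reads. */
static uint64_t read_timebase64(void)
{
	uint32_t hi, lo, hi2;

	do {
		hi  = read_tbu();
		lo  = read_tbl();
		hi2 = read_tbu();
	} while (hi != hi2);

	return ((uint64_t)hi << 32) | lo;
}
```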

> 
>> +    oris    r8, r6, MSR_CE@h
>> +#ifndef CONFIG_64BIT
>> +    stw     r6, (VCPU_SHARED_MSR + 4)(r11)
>> +#else
>> +    std     r6, (VCPU_SHARED_MSR)(r11)
>> +#endif
>> +    ori     r8, r8, MSR_ME | MSR_RI
>> +    PPC_STL r5, VCPU_PC(r4)
>> +
>> +    /*
>> +     * Make sure CE/ME/RI are set (if appropriate for exception type)
>> +     * whether or not the guest had it set.  Since mfmsr/mtmsr are
>> +     * somewhat expensive, skip in the common case where the guest
>> +     * had all these bits set (and thus they're still set if
>> +     * appropriate for the exception type).
>> +     */
>> +    cmpw    r6, r8
>> +    .if     \flags & NEED_EMU
>> +    lwz     r9, KVM_LPID(r9)
>> +    .endif
> 
> Where do we use r9?  This is probably left over from something old.

Looks like it, yup.
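For anyone reading along, the oris/ori/cmpw dance in C terms (bit values as in the usual kernel MSR definitions, illustrative here): OR the always-required bits into the MSR we exited with, and only pay for an expensive mtmsr when that actually changed something.

```c
#include <stdint.h>

/* Bit values as commonly defined for the PowerPC MSR; treat them as
 * illustrative stand-ins here. */
#define MSR_CE 0x00020000u	/* critical interrupt enable */
#define MSR_ME 0x00001000u	/* machine check enable */
#define MSR_RI 0x00000002u	/* recoverable interrupt */

/* Compute the MSR the handler requires (CE only where appropriate
 * for the exception type) and report whether an mtmsr is needed. */
static uint32_t msr_fixup(uint32_t exit_msr, int want_ce, int *need_mtmsr)
{
	uint32_t required = exit_msr | MSR_ME | MSR_RI;

	if (want_ce)
		required |= MSR_CE;

	*need_mtmsr = (required != exit_msr);
	return required;
}
```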


Alex

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
