On 01.02.2013, at 23:38, Scott Wood wrote:

> On 01/31/2013 06:11:32 PM, Alexander Graf wrote:
>> On 31.01.2013, at 23:40, Scott Wood wrote:
>> > On 01/31/2013 01:20:39 PM, Alexander Graf wrote:
>> >> On 31.01.2013, at 20:05, Alexander Graf wrote:
>> >> >
>> >> > On 31.01.2013, at 19:54, Scott Wood wrote:
>> >> >
>> >> >> On 01/31/2013 12:52:41 PM, Alexander Graf wrote:
>> >> >>> On 31.01.2013, at 19:43, Scott Wood wrote:
>> >> >>>> On 01/31/2013 12:21:07 PM, Alexander Graf wrote:
>> >> >>>>> How about something like this? Then both targets at least suck as 
>> >> >>>>> much :).
>> >> >>>>
>> >> >>>> I'm not sure that should be the goal...
>> >> >>>>
>> >> >>>>> Thanks to e500mc's awful hardware design, we don't know who sets 
>> >> >>>>> the MSR_DE bit. Once we forced it onto the guest, we have no change 
>> >> >>>>> to know whether the guest also set it or not. We could only guess.
>> >> >>>>
>> >> >>>> MSRP[DEP] can prevent the guest from modifying MSR[DE] -- but we 
>> >> >>>> still need to set it in the first place.
>> >> >>>>
>> >> >>>> According to ISA V2.06B, the hypervisor should set DBCR0[EDM] to let 
>> >> >>>> the guest know that the debug resources are not available, and that 
>> >> >>>> "the value of MSR[DE] is not specified and not modifiable".
>> >> >>> So what would the guest do then to tell the hypervisor that it 
>> >> >>> actually wants to know about debug events?
>> >> >>
>> >> >> The guest is out of luck, just as if a JTAG were in use.
>> >> >
>> >> > Hrm.
>> >> >
>> >> > Can we somehow generalize this "out of luck" behavior?
>> >> >
>> >> > Every time we would set or clear an MSR bit in shadow_msr on e500v2, we 
>> >> > would instead set or clear it in the real MSR. That way only e500mc is 
>> >> > out of luck, but the code would still be shared.
>> >
>> > I don't follow.  e500v2 is just as out-of-luck.  The mechanism simply does 
>> > not support sharing debug resources.
>> For e500v2 we have 2 fields
>>  * MSR as the guest sees it
>>  * MSR as we execute when the guest runs
>> Since we know the MSR when the guest sees it, we can decide what to do when 
>> we get an unhandled debug interrupt.
> 
> That's not the same thing as making the real MSR[DE] show up in the guest 
> MSR[DE].
> 
> There are other problems with sharing -- what happens when both host and 
> guest try to write to a particular IAC or DAC?
> 
> Also, performance would be pretty awful if the guest has e.g. single stepping 
> in DBCR0 enabled but MSR[DE]=0, and the host doesn't care about single 
> stepping (but does want debugging enabled in general).
> 
>> > What do you mean by "the real MSR"?  The real MSR is shadow_msr, and 
>> > MSR_DE must always be set there if the host is debugging the guest.  As 
>> > for reflecting it into the guest MSR, we could, but I don't really see the 
>> > point.  We're never going to actually send a debug exception to the guest 
>> > when the host owns the debug resources.
>> Why not? That's the whole point of jumping through user space.
> 
> That's still needed for software breakpoints, which don't rely on the debug 
> resources.
> 
>>  1) guest exits with debug interrupt
>>  2) QEMU gets a debug exit
>>  3) QEMU checks in its list whether it belongs to its own debug points
>>  4) if not, it reinjects the interrupt into the guest
>> Step 4 is pretty difficult to do when we don't know whether the guest is 
>> actually capable of handling debug interrupts at that moment.
> 
> Software breakpoints take a Program interrupt rather than a Debug interrupt, 
> unless MSR[DE]=1 and DBCR0[TRAP]=1.  If the guest does not own debug 
> resources we should always send it to the Program interrupt, so MSR[DE] 
> doesn't matter.
> 
>> > The "&= ~MSR_DE" line is pointless on bookehv, and makes it harder to 
>> > read.  I had to stare at it a while before noticing that you initially set 
>> > is_debug from the guest MSR and that you'd never really clear MSR_DE here 
>> > on bookehv.
>> Well, I'm mostly bouncing ideas here to find a way to express what we're 
>> trying to say in a way that someone who hasn't read this email thread would 
>> still understand what's going on :).
> 
> I think it's already straightforward enough if you accept that shared debug 
> resources aren't supported, and that we are either in a mode where the real 
> MSR[DE] reflects the guest MSR[DE], or a mode where the real MSR[DE] is 
> always on in guest mode and the guest MSR[DE] is irrelevant.

I think I'm starting to grasp what you're suggesting:

On e500mc, have 2 modes

  1) guest owns debug

  This is the normal operation. Here the guest defines the value of MSR_DE. The 
guest gets debug interrupts directly.

  2) host owns debug

  In this case, take away any debug capabilities from the guest. Everything 
debug related goes straight to QEMU.


On e500v2, have 2 modes

  1) guest owns debug

  This is the normal operation. Here the guest defines the value of MSR_DE. The 
guest gets debug interrupts directly.

  2) host and guest share debug

  We want to enable debug always. Guest and host debug desire DBCR, IAC, DAC 
registers are merged by QEMU. All debug events go to QEMU, QEMU reinjects 
interrupts when they really were meant for the guest.


The reason that the e500v2 approach doesn't work on e500mc is that e500mc 
doesn't support trapping MSR_DE writes.

Is there any foreseeable way in future hardware that the e500v2 way would work 
on HV capable chips as well? If not, we should probably just ditch that 
approach, make e500v2 as limited as e500mc and keep the code simple.

>From a kvm <-> user space interface point of view, we want to keep the door 
>open for the e500v2 approach though, even though we wouldn't implement it for 
>now.


Alex

> 
>> How about this version?
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index 38a62ef..9929c41 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -133,6 +133,28 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
>> #endif
>> }
>> +static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
>> +{
>> +#ifndef CONFIG_KVM_BOOKE_HV
>> +    /* Synchronize guest's desire to get debug interrupts into shadow MSR */
>> +    vcpu->arch.shadow_msr &= ~MSR_DE;
>> +    vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
>> +#endif
>> +
>> +    /* Force enable debug interrupts when user space wants to debug */
>> +    if (vcpu->guest_debug) {
>> +#ifdef CONFIG_KVM_BOOKE_HV
>> +            /*
>> +             * Since there is no shadow MSR, sync MSR_DE into the guest
>> +             * visible MSR.
>> +             */
>> +            vcpu->arch.shared->msr |= MSR_DE;
>> +#else
>> +            vcpu->arch.shadow_msr |= MSR_DE;
>> +#endif
>> +    }
>> +}
> 
> This shows "guest's desire to get debug interrupts" in a context that is not 
> specifically for !vcpu->guest_debug, which is misleading.
> 
>> +
>> /*
>>  * Helper function for "full" MSR writes.  No need to call this if only
>>  * EE/CE/ME/DE/RI are changing.
>> @@ -150,6 +172,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
>>      kvmppc_mmu_msr_notify(vcpu, old_msr);
>>      kvmppc_vcpu_sync_spe(vcpu);
>>      kvmppc_vcpu_sync_fpu(vcpu);
>> +    kvmppc_vcpu_sync_debug(vcpu);
>> }
>> static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
>> My main concern here is that we don't know when to remove MSR_DE again from 
>> the (shadow) MSR. So how about this one instead?
> 
> Why wouldn't you know this?  if (vcpu->guest_debug) { you never remove it } 
> else { just copy whatever's in guest MSR }
> 
>> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
>> index 38a62ef..2676703 100644
>> --- a/arch/powerpc/kvm/booke.c
>> +++ b/arch/powerpc/kvm/booke.c
>> @@ -142,7 +142,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
>>      u32 old_msr = vcpu->arch.shared->msr;
>> #ifdef CONFIG_KVM_BOOKE_HV
>> -    new_msr |= MSR_GS;
>> +    new_msr |= MSR_GS | MSR_DE;
>> #endif
>>      vcpu->arch.shared->msr = new_msr;
>> That would semantically move e500mc to the same logic as e500v2. With the 
>> main difference that we have no idea what MSR_DE value the guest really 
>> wanted to have set.
> 
> This would break the case where the guest owns the debug resources.
> 
>> If I read the spec correctly, rfci traps.
> 
> rfdi is the relevant one for e500mc, but yes.
> 
>> So we know the time frame from [inject debug interrupt ... rfci]. During 
>> that time we know for sure that the guest thinks MSR_DE is 0.
> 
> No, we don't.  The guest could have tried to use mtmsr or rfi to enable 
> MSR[DE].  It could have seen the context it came from was userspace, and 
> scheduled to another process, etc.
> 
>> Outside of that context, we just have to assume the guest can always receive 
>> debug interrupts if it configured them.
> 
> No.
> 
> -Scott
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to