> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, July 29, 2014 3:58 AM
> To: Bhushan Bharat-R65777
> Cc: [email protected]; [email protected]; [email protected]; Yoder
> Stuart-
> B08248
> Subject: Re: [PATCH 6/6] KVM: PPC: BOOKE: Emulate debug registers and
> exception
>
> On Fri, 2014-07-11 at 14:09 +0530, Bharat Bhushan wrote:
> > This patch emulates debug registers and debug exception
> > to support guest using debug resource. This enables running
> > gdb/kgdb etc in guest.
> >
> > On BOOKE architecture we cannot share debug resources between QEMU and
> > guest because:
> > When QEMU is using debug resources then debug exception must
> > be always enabled. To achieve this we set MSR_DE and also set
> > MSRP_DEP so guest cannot change MSR_DE.
> >
> > When emulating debug resource for guest we want guest
> > to control MSR_DE (enable/disable debug interrupt on need).
> >
> > So above mentioned two configuration cannot be supported
> > at the same time. So the result is that we cannot share
> > debug resources between QEMU and Guest on BOOKE architecture.
> >
> > In the current design QEMU gets priority over guest, this means that if
> > QEMU is using debug resources then guest cannot use them and if guest is
> > using debug resource then QEMU can overwrite them.
> >
> > Signed-off-by: Bharat Bhushan <[email protected]>
> > ---
> > Hi Alex,
> >
> > I thought of having some print in register emulation if QEMU
> > is using debug resource, Also when QEMU overwrites guest written
> > values but that looks excessive. If I uses some variable which
> > get set when guest starts using debug registers and check in
> > debug set ioctl then that look ugly. Looking for suggestions
> >
> > arch/powerpc/include/asm/kvm_ppc.h | 3 +
> > arch/powerpc/kvm/booke.c | 27 +++++++
> > arch/powerpc/kvm/booke_emulate.c | 157
> +++++++++++++++++++++++++++++++++++++
> > 3 files changed, 187 insertions(+)
> >
> > diff --git a/arch/powerpc/include/asm/kvm_ppc.h
> b/arch/powerpc/include/asm/kvm_ppc.h
> > index e2fd5a1..f3f7611 100644
> > --- a/arch/powerpc/include/asm/kvm_ppc.h
> > +++ b/arch/powerpc/include/asm/kvm_ppc.h
> > @@ -173,6 +173,9 @@ extern int kvmppc_xics_get_xive(struct kvm *kvm, u32
> > irq,
> u32 *server,
> > extern int kvmppc_xics_int_on(struct kvm *kvm, u32 irq);
> > extern int kvmppc_xics_int_off(struct kvm *kvm, u32 irq);
> >
> > +void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu);
> > +void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu);
> > +
> > union kvmppc_one_reg {
> > u32 wval;
> > u64 dval;
> > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> > index fadfe76..c2471ed 100644
> > --- a/arch/powerpc/kvm/booke.c
> > +++ b/arch/powerpc/kvm/booke.c
> > @@ -264,6 +264,16 @@ static void kvmppc_core_dequeue_watchdog(struct
> > kvm_vcpu
> *vcpu)
> > clear_bit(BOOKE_IRQPRIO_WATCHDOG, &vcpu->arch.pending_exceptions);
> > }
> >
> > +void kvmppc_core_queue_debug(struct kvm_vcpu *vcpu)
> > +{
> > + kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DEBUG);
> > +}
> > +
> > +void kvmppc_core_dequeue_debug(struct kvm_vcpu *vcpu)
> > +{
> > + clear_bit(BOOKE_IRQPRIO_DEBUG, &vcpu->arch.pending_exceptions);
> > +}
> > +
> > static void set_guest_srr(struct kvm_vcpu *vcpu, unsigned long srr0, u32
> srr1)
> > {
> > #ifdef CONFIG_KVM_BOOKE_HV
> > @@ -783,6 +793,23 @@ static int kvmppc_handle_debug(struct kvm_run *run,
> struct kvm_vcpu *vcpu)
> > struct debug_reg *dbg_reg = &(vcpu->arch.shadow_dbg_reg);
> > u32 dbsr = vcpu->arch.dbsr;
> >
> > + if (vcpu->guest_debug == 0) {
> > + /* Debug resources belong to Guest */
> > + if (dbsr && (vcpu->arch.shared->msr & MSR_DE))
> > + kvmppc_core_queue_debug(vcpu);
>
> Should also check for DBCR0[IDM].
Ok
>
> > +
> > + /* Inject a program interrupt if trap debug is not allowed */
> > + if ((dbsr & DBSR_TIE) && !(vcpu->arch.shared->msr & MSR_DE))
> > + kvmppc_core_queue_program(vcpu, ESR_PTR);
> > +
> > + return RESUME_GUEST;
> > + }
> > +
> > + /*
> > + * Prepare for userspace exit as debug resources
> > + * are owned by userspace.
> > + */
> > + vcpu->arch.dbsr = 0;
> > run->debug.arch.status = 0;
> > run->debug.arch.address = vcpu->arch.pc;
>
> Why are you clearing vcpu->arch.dbsr?
It was discussed in some other email, as soon as we decide that the debug
interrupt is handled by QEMU then we clear vcpu->arch->dbsr.
- QEMU cannot inject debug interrupt to guest (as this does not know guest
ability to handle debug interrupt; MSR_DE), so will always clear DBSR.
- If QEMU has to always clear DBSR than in handling KVM_EXIT_DEBUG then this
avoid doing in SET_SREGS
> Userspace might be interested in
> the raw value,
With the current design, If userspace is interested then it will not get the
DBSR. But why userspace will be interested?
> plus it's a change from the current API semantics.
Can you please let us know how ?
>
> Where is this run->debug.arch.<foo> stuff and KVM_EXIT_DEBUG documented?
I do not see anything in Documentation/. While there is some not in
arch/powerpc/include/uapi/asm/kvm.h.
(Note: this is not something new exit type added by this patch).
>
>
> > diff --git a/arch/powerpc/kvm/booke_emulate.c
> b/arch/powerpc/kvm/booke_emulate.c
> > index 2a20194..3d143fe 100644
> > --- a/arch/powerpc/kvm/booke_emulate.c
> > +++ b/arch/powerpc/kvm/booke_emulate.c
> > @@ -139,6 +139,7 @@ int kvmppc_booke_emulate_op(struct kvm_run *run, struct
> kvm_vcpu *vcpu,
> > int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, ulong
> spr_val)
> > {
> > int emulated = EMULATE_DONE;
> > + bool debug_inst = false;
> >
> > switch (sprn) {
> > case SPRN_DEAR:
> > @@ -153,14 +154,137 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu,
> int sprn, ulong spr_val)
> > case SPRN_CSRR1:
> > vcpu->arch.csrr1 = spr_val;
> > break;
> > + case SPRN_DSRR0:
> > + vcpu->arch.dsrr0 = spr_val;
> > + break;
> > + case SPRN_DSRR1:
> > + vcpu->arch.dsrr1 = spr_val;
> > + break;
> > + case SPRN_IAC1:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.iac1 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.iac1 = spr_val;
> > + break;
> > + case SPRN_IAC2:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.iac2 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.iac2 = spr_val;
> > + break;
> > +#if CONFIG_PPC_ADV_DEBUG_IACS > 2
> > + case SPRN_IAC3:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.iac3 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.iac3 = spr_val;
> > + break;
> > + case SPRN_IAC4:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.iac4 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.iac4 = spr_val;
> > + break;
> > +#endif
> > + case SPRN_DAC1:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.dac1 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.dac1 = spr_val;
> > + break;
> > + case SPRN_DAC2:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.dac2 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.dac2 = spr_val;
> > + break;
> > case SPRN_DBCR0:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + spr_val &= (DBCR0_IDM | DBCR0_IC | DBCR0_BT | DBCR0_TIE |
> > + DBCR0_IAC1 | DBCR0_IAC2 | DBCR0_IAC3 | DBCR0_IAC4 |
> > + DBCR0_DAC1R | DBCR0_DAC1W | DBCR0_DAC2R | DBCR0_DAC2W);
> > +
> > vcpu->arch.dbg_reg.dbcr0 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.dbcr0 = spr_val;
> > break;
> > case SPRN_DBCR1:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > vcpu->arch.dbg_reg.dbcr1 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.dbcr1 = spr_val;
> > + break;
> > + case SPRN_DBCR2:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > + debug_inst = true;
> > + vcpu->arch.dbg_reg.dbcr2 = spr_val;
> > + vcpu->arch.shadow_dbg_reg.dbcr2 = spr_val;
> > break;
>
> In what circumstances can the architected and shadow registers differ?
As of now they are same. But I think that if we want to implement other
features like "Freeze Timer (FT)" then they can be different.
>
> > case SPRN_DBSR:
> > + /*
> > + * If userspace is debugging guest then guest
> > + * can not access debug registers.
> > + */
> > + if (vcpu->guest_debug)
> > + break;
> > +
> > vcpu->arch.dbsr &= ~spr_val;
> > + if (vcpu->arch.dbsr == 0)
> > + kvmppc_core_dequeue_debug(vcpu);
> > break;
>
> Not all DBSR bits cause an exception, e.g. IDE and MRR.
I am not sure what we should in that case ?
As we are currently emulating a subset of debug events (IAC, DAC, IC, BT and
TIE --- DBCR0 emulation) then we should expose status of those events in guest
dbsr and rest should be cleared ?
>
> > case SPRN_TSR:
> > kvmppc_clr_tsr_bits(vcpu, spr_val);
> > @@ -273,6 +397,10 @@ int kvmppc_booke_emulate_mtspr(struct kvm_vcpu *vcpu,
> > int
> sprn, ulong spr_val)
> > emulated = EMULATE_FAIL;
> > }
> >
> > + if (debug_inst) {
> > + switch_booke_debug_regs(&vcpu->arch.shadow_dbg_reg);
> > + current->thread.debug = vcpu->arch.shadow_dbg_reg;
> > + }
>
> Could you explain what's going on with regard to copying the registers
> into current->thread.debug? Why is it done after loading the registers
> into the hardware (is there a race if we get preempted in the middle)?
Yes, and this was something I was not clear when writing this code. Should we
have preempt disable-enable around this.
Thanks
-Bharat
>
> -Scott
> ;
N�����r��y����b�X��ǧv�^�){.n�+����jir)����w*jg��������ݢj/���z�ޖ��2�ޙ����&�)ߡ�a�����G���h��j:+v���w��٥