On 04.10.2012, at 17:19, Bhushan Bharat-R65777 wrote:
>
>
>>>>>>> -static int emulation_exit(struct kvm_run *run, struct kvm_vcpu
>>>>>>> *vcpu)
>>>>>>> +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu,
>>>>>>> + int exit_nr)
>>>>>>> {
>>>>>>> enum emulation_result er;
>>>>>>>
>>>>>>> + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP) &&
>>>>>>> + vcpu->arch.last_inst == KVMPPC_INST_GUEST_GDB) {
>>>>>>
>>>>>> This belongs into the normal emulation code path, behind the same
>>>>>> switch() that everything else goes through.
>>>>>
>>>>> I am not sure I understood correctly. Below is the reason why I
>>>>> placed this code here.
>>>>> The instruction where a software breakpoint is to be set is replaced
>>>>> by an "ehpriv" instruction. On e500v2 this is not a valid instruction
>>>>> and causes a program interrupt. On e500mc, "ehpriv" is a valid
>>>>> instruction. Both exit paths call emulation_exit(), so we placed the
>>>>> code in this function.
>>>>> Do you want this code moved into the program interrupt exit path for
>>>>> e500v2 and into BOOKE_INTERRUPT_HV_PRIV for e500mc?
>>>>
>>>> Ok, in this patch you do (basically):
>>>>
>>>> int emulation_exit()
>>>> {
>>>> if (inst == DEBUG_INST) {
>>>> debug_stuff();
>>>> return;
>>>> }
>>>>
>>>> switch (inst) {
>>>> case INST_A:
>>>> foo();
>>>> ....
>>>> }
>>>> }
>>>
>>> Aren't we doing something like this:
>>> int emulation_exit()
>>> {
>>> if (inst == DEBUG_INST) {
>>> debug_stuff();
>>> return;
>>> }
>>>
>>> status = kvmppc_emulate_instruction()
>>> switch (status) {
>>> case FAIL:
>>> foo();
>>> case DONE:
>>> foo1();
>>> ....
>>> }
>>> }
>>>
>>> Do you want something like this:
>>>
>>> int emulation_exit()
>>> {
>>>
>>> status = kvmppc_emulate_instruction()
>>> switch (status) {
>>> case FAIL:
>>> if (inst == DEBUG_INST) {
>>> debug_stuff();
>>> return;
>>> }
>>> foo();
>>>
>>> case DONE:
>>> foo1();
>>> ....
>>> }
>>> }
>>
>> No, I want DEBUG_INST to be handled the same as any other instruction
>> we emulate.
>
> I would like to understand what you are thinking.
> What I derived is: add the instruction to kvmppc_emulate_instruction() (or
> one of its child functions), which
> 1) fills the relevant information into run->, calls
> kvmppc_account_exit(vcpu, DEBUG_EXITS); and returns EMULATE_DONE.
> And in emulation_exit():
> status = kvmppc_emulate_instruction()
> switch (status) {
> case EMULATE_DONE:
> if (inst == DEBUG)
> return RESUME_HOST;
> }
> Or
> 2) calls kvmppc_account_exit(vcpu, DEBUG_EXITS); and returns EMULATE_DONE.
> And in emulation_exit():
> status = kvmppc_emulate_instruction()
> switch (status) {
> case EMULATE_DONE:
> if (inst == DEBUG) {
> fill run->
> return RESUME_HOST;
> }
> }
> Or
> 3) calls kvmppc_account_exit(vcpu, DEBUG_EXITS); and returns a new status
> type (EMULATE_DEBUG_INST).
> And in emulation_exit():
> status = kvmppc_emulate_instruction()
> switch (status) {
> case EMULATE_DEBUG_INST:
> fill run->
> return RESUME_HOST;
> }
This one :).
>
>>
>>>>
>>>> what I want is:
>>>>
>>>> int emulation_exit()
>>>> {
>>>> switch (inst) {
>>>> case INST_A:
>>>> foo(); break;
>>>> case DEBUG_INST:
>>>> debug_stuff(); break;
>>>> ....
>>>> }
>>>> }
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> + run->exit_reason = KVM_EXIT_DEBUG;
>>>>>>> + run->debug.arch.pc = vcpu->arch.pc;
>>>>>>> + run->debug.arch.exception = exit_nr;
>>>>>>> + run->debug.arch.status = 0;
>>>>>>> + kvmppc_account_exit(vcpu, DEBUG_EXITS);
>>>>>>> + return RESUME_HOST;
>>>>>>> + }
>>>>>>> +
>>>>>>> er = kvmppc_emulate_instruction(run, vcpu);
>>>>>>> switch (er) {
>>>>>>> case EMULATE_DONE:
>>>>>>> @@ -697,6 +711,44 @@ static int emulation_exit(struct kvm_run *run,
>>>>>>> struct kvm_vcpu *vcpu)
>>>>>>> default:
>>>>>>> BUG();
>>>>>>> }
>>>>>>> +
>>>>>>> + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_ENABLE) &&
>>>>>>> + (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>>>>>>
>>>>>> I don't understand how this is supposed to work. When we enable
>>>>>> singlestep, why would we end up in emulation_exit()?
>>>>>
>>>>> When singlestep is enabled then we set DBCR0[ICMP] and the debug
>>>>> handler
>>>> should be able to handle this. I think you are right.
>>>>>
>>>>>>
>>>>>>> + run->exit_reason = KVM_EXIT_DEBUG;
>>>>>>> + return RESUME_HOST;
>>>>>>> + }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int kvmppc_handle_debug(struct kvm_run *run, struct kvm_vcpu *vcpu)
>>>>>>> +{
>>>>>>> + u32 dbsr;
>>>>>>> +
>>>>>>> +#ifndef CONFIG_KVM_BOOKE_HV
>>>>>>> + if (cpu_has_feature(CPU_FTR_DEBUG_LVL_EXC))
>>>>>>> + vcpu->arch.pc = mfspr(SPRN_DSRR0);
>>>>>>> + else
>>>>>>> + vcpu->arch.pc = mfspr(SPRN_CSRR0);
>>>>>>> +#endif
>>>>>>
>>>>>> Why doesn't this get handled in the asm code that recovers from the
>>>>>> respective exceptions?
>>>>>
>>>>> Yes. I will remove this.
>>>>>
>>>>>>
>>>>>>> + dbsr = vcpu->arch.dbsr;
>>>>>>> +
>>>>>>> + run->debug.arch.pc = vcpu->arch.pc;
>>>>>>> + run->debug.arch.status = 0;
>>>>>>> + vcpu->arch.dbsr = 0;
>>>>>>> +
>>>>>>> + if (dbsr & (DBSR_IAC1 | DBSR_IAC2 | DBSR_IAC3 | DBSR_IAC4)) {
>>>>>>> + run->debug.arch.status |= KVMPPC_DEBUG_BREAKPOINT;
>>>>>>> + } else {
>>>>>>> + if (dbsr & (DBSR_DAC1W | DBSR_DAC2W))
>>>>>>> + run->debug.arch.status |= KVMPPC_DEBUG_WATCH_WRITE;
>>>>>>> + else if (dbsr & (DBSR_DAC1R | DBSR_DAC2R))
>>>>>>> + run->debug.arch.status |= KVMPPC_DEBUG_WATCH_READ;
>>>>>>> + if (dbsr & (DBSR_DAC1R | DBSR_DAC1W))
>>>>>>> + run->debug.arch.pc = vcpu->arch.shadow_dbg_reg.dac[0];
>>>>>>> + else if (dbsr & (DBSR_DAC2R | DBSR_DAC2W))
>>>>>>> + run->debug.arch.pc = vcpu->arch.shadow_dbg_reg.dac[1];
>>>>>>> + }
>>>>>>
>>>>>> Mind to explain the guest register sharing mechanism you are
>>>>>> envisioning? Which registers are used by the guest? Which are used
>>>>>> by the host? Who decides what gets used by whom?
>>>>>
>>>>> struct kvm_vcpu_arch {
>>>>> .
>>>>> .
>>>>>
>>>>> /* guest debug registers*/
>>>>> struct kvmppc_booke_debug_reg dbg_reg;
>>>>> /* shadow debug registers */
>>>>> struct kvmppc_booke_debug_reg shadow_dbg_reg;
>>>>> /* host debug registers*/
>>>>> struct kvmppc_booke_debug_reg host_dbg_reg;
>>>>>
>>>>> .
>>>>> .
>>>>> }
>>>>>
>>>>> dbg_reg: contains the values written by the guest.
>>>>> shadow_dbg_reg: contains the actual values to be written on behalf
>>>>> of the guest/QEMU.
>>>>> host_dbg_reg: the host's debug register values (needed for
>>>>> save/restore purposes).
>>>>>
>>>>> According to the current design, the QEMU debug stub and the guest
>>>>> cannot share the debug resources; the QEMU debug stub is given
>>>>> priority. But host and guest can share all debug registers, and the
>>>>> host cannot debug the guest if the guest is using the debug
>>>>> resources (which probably is not a good situation).
>>>>
>>>> Ok. Assume that in guest context, we only ever want guest debugging
>>>> enabled. If QEMU wants to inject a QEMU breakpoint, it will fiddle
>>>> with the guest debug registers itself to inject its breakpoint into them.
>>>
>>> I would like to understand the context you are describing.
>>> QEMU would inject a debug interrupt into the guest only if a debug
>>> interrupt happened in guest context and QEMU is not able to handle it
>>> (because QEMU debugging is not set up). Right?
>>
>> Yes
>>
>>> In that case the only thing that needs to be updated is the guest's DBSR?
>>
>> I think you lost me here :)
>
> Really :). Maybe I wrote something wrong.
>
>>
>>>
>>>>
>>>> With that scheme, would we still need shadow debug registers? Or
>>>> could we remove that one level of indirection?
>>>>
>>>>>
>>>>>>
>>>>>>> +
>>>>>>> + return RESUME_HOST;
>>>>>>> }
>>>>>>>
>>>>>>> static void kvmppc_fill_pt_regs(struct pt_regs *regs)
>>>>>>> @@ -843,7 +895,7 @@ int kvmppc_handle_exit(struct kvm_run *run,
>>>>>>> struct kvm_vcpu *vcpu,
>>>>>>> break;
>>>>>>>
>>>>>>> case BOOKE_INTERRUPT_HV_PRIV:
>>>>>>> - r = emulation_exit(run, vcpu);
>>>>>>> + r = emulation_exit(run, vcpu, exit_nr);
>>>>>>> break;
>>>>>>>
>>>>>>> case BOOKE_INTERRUPT_PROGRAM:
>>>>>>> @@ -862,7 +914,7 @@ int kvmppc_handle_exit(struct kvm_run *run,
>>>>>>> struct kvm_vcpu *vcpu,
>>>>>>> break;
>>>>>>> }
>>>>>>>
>>>>>>> - r = emulation_exit(run, vcpu);
>>>>>>> + r = emulation_exit(run, vcpu, exit_nr);
>>>>>>> break;
>>>>>>>
>>>>>>> case BOOKE_INTERRUPT_FP_UNAVAIL:
>>>>>>> @@ -1052,18 +1104,12 @@ int kvmppc_handle_exit(struct kvm_run *run,
>>>>>>> struct kvm_vcpu *vcpu,
>>>>>>> }
>>>>>>>
>>>>>>> case BOOKE_INTERRUPT_DEBUG: {
>>>>>>> - u32 dbsr;
>>>>>>> -
>>>>>>> - vcpu->arch.pc = mfspr(SPRN_CSRR0);
>>>>>>> -
>>>>>>> - /* clear IAC events in DBSR register */
>>>>>>> - dbsr = mfspr(SPRN_DBSR);
>>>>>>> - dbsr &= DBSR_IAC1 | DBSR_IAC2 | DBSR_IAC3 | DBSR_IAC4;
>>>>>>> - mtspr(SPRN_DBSR, dbsr);
>>>>>>> -
>>>>>>> - run->exit_reason = KVM_EXIT_DEBUG;
>>>>>>> + r = kvmppc_handle_debug(run, vcpu);
>>>>>>> + if (r == RESUME_HOST) {
>>>>>>> + run->debug.arch.exception = exit_nr;
>>>>>>> + run->exit_reason = KVM_EXIT_DEBUG;
>>>>>>> + }
>>>>>>> kvmppc_account_exit(vcpu, DEBUG_EXITS);
>>>>>>> - r = RESUME_HOST;
>>>>>>> break;
>>>>>>> }
>>>>>>>
>>>>>>> @@ -1403,10 +1449,78 @@ int kvm_vcpu_ioctl_set_one_reg(struct
>>>>>>> kvm_vcpu *vcpu, struct kvm_one_reg *reg)
>>>>>>> return r;
>>>>>>> }
>>>>>>>
>>>>>>> +#define BP_NUM KVMPPC_BOOKE_IAC_NUM
>>>>>>> +#define WP_NUM KVMPPC_BOOKE_DAC_NUM
>>>>>>> +
>>>>>>> int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>>>>>>> struct kvm_guest_debug *dbg)
>>>>>>> {
>>>>>>> - return -EINVAL;
>>>>>>> +
>>>>>>> + if (!(dbg->control & KVM_GUESTDBG_ENABLE)) {
>>>>>>> + /* Clear All debug events */
>>>>>>> + vcpu->arch.shadow_dbg_reg.dbcr0 = 0;
>>>>>>> + vcpu->guest_debug = 0;
>>>>>>> + return 0;
>>>>>>> + }
>>>>>>> +
>>>>>>> + vcpu->guest_debug = dbg->control;
>>>>>>> + vcpu->arch.shadow_dbg_reg.dbcr0 = 0;
>>>>>>> +
>>>>>>> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
>>>>>>> + vcpu->arch.shadow_dbg_reg.dbcr0 |= DBCR0_IDM | DBCR0_IC;
>>>>>>> +
>>>>>>> + if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
>>>>>>> + struct kvmppc_booke_debug_reg *gdbgr =
>>>>>>> + &(vcpu->arch.shadow_dbg_reg);
>>>>>>> + int n, b = 0, w = 0;
>>>>>>> + const u32 bp_code[] = {
>>>>>>> + DBCR0_IAC1 | DBCR0_IDM,
>>>>>>> + DBCR0_IAC2 | DBCR0_IDM,
>>>>>>> + DBCR0_IAC3 | DBCR0_IDM,
>>>>>>> + DBCR0_IAC4 | DBCR0_IDM
>>>>>>> + };
>>>>>>> + const u32 wp_code[] = {
>>>>>>> + DBCR0_DAC1W | DBCR0_IDM,
>>>>>>> + DBCR0_DAC2W | DBCR0_IDM,
>>>>>>> + DBCR0_DAC1R | DBCR0_IDM,
>>>>>>> + DBCR0_DAC2R | DBCR0_IDM
>>>>>>> + };
>>>>>>> +
>>>>>>> +#ifndef CONFIG_KVM_BOOKE_HV
>>>>>>> + gdbgr->dbcr1 = DBCR1_IAC1US | DBCR1_IAC2US |
>>>>>>> + DBCR1_IAC3US | DBCR1_IAC4US;
>>>>>>> + gdbgr->dbcr2 = DBCR2_DAC1US | DBCR2_DAC2US;
>>>>>>> +#else
>>>>>>> + gdbgr->dbcr1 = 0;
>>>>>>> + gdbgr->dbcr2 = 0;
>>>>>>> +#endif
>>>>>>> +
>>>>>>> + for (n = 0; n < (BP_NUM + WP_NUM); n++) {
>>>>>>> + u32 type = dbg->arch.bp[n].type;
>>>>>>> +
>>>>>>> + if (!type)
>>>>>>> + break;
>>>>>>> +
>>>>>>> + if (type & (KVMPPC_DEBUG_WATCH_READ |
>>>>>>> + KVMPPC_DEBUG_WATCH_WRITE)) {
>>>>>>> + if (w < WP_NUM) {
>>>>>>> + if (type & KVMPPC_DEBUG_WATCH_READ)
>>>>>>> + gdbgr->dbcr0 |= wp_code[w + 2];
>>>>>>> + if (type & KVMPPC_DEBUG_WATCH_WRITE)
>>>>>>> + gdbgr->dbcr0 |= wp_code[w];
>>>>>>> + gdbgr->dac[w] = dbg->arch.bp[n].addr;
>>>>>>> + w++;
>>>>>>> + }
>>>>>>> + } else if (type & KVMPPC_DEBUG_BREAKPOINT) {
>>>>>>> + if (b < BP_NUM) {
>>>>>>> + gdbgr->dbcr0 |= bp_code[b];
>>>>>>> + gdbgr->iac[b] = dbg->arch.bp[n].addr;
>>>>>>> + b++;
>>>>>>> + }
>>>>>>> + }
>>>>>>> + }
>>>>>>> + }
>>>>>>> + return 0;
>>>>>>> }
>>>>>>>
>>>>>>> int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct
>>>>>>> kvm_fpu *fpu)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/kvm/booke_interrupts.S
>>>>>>> b/arch/powerpc/kvm/booke_interrupts.S
>>>>>>> index dd9c5d4..2202936 100644
>>>>>>> --- a/arch/powerpc/kvm/booke_interrupts.S
>>>>>>> +++ b/arch/powerpc/kvm/booke_interrupts.S
>>>>>>> @@ -39,6 +39,8 @@
>>>>>>> #define HOST_MIN_STACK_SIZE (HOST_NV_GPR(R31) + 4)
>>>>>>> #define HOST_STACK_SIZE (((HOST_MIN_STACK_SIZE + 15) / 16) * 16) /* Align. */
>>>>>>> #define HOST_STACK_LR (HOST_STACK_SIZE + 4) /* In caller stack frame. */
>>>>>>> +#define DBCR0_AC_BITS (DBCR0_IAC1 | DBCR0_IAC2 | DBCR0_IAC3 | DBCR0_IAC4 | \
>>>>>>> + DBCR0_DAC1R | DBCR0_DAC1W | DBCR0_DAC2R | DBCR0_DAC2W)
>>>>>>>
>>>>>>> #define NEED_INST_MASK ((1<<BOOKE_INTERRUPT_PROGRAM) | \
>>>>>>> (1<<BOOKE_INTERRUPT_DTLB_MISS) | \
>>>>>>> @@ -52,6 +54,8 @@
>>>>>>> (1<<BOOKE_INTERRUPT_PROGRAM) | \
>>>>>>> (1<<BOOKE_INTERRUPT_DTLB_MISS))
>>>>>>>
>>>>>>> +#define NEED_DEBUG_SAVE (1<<BOOKE_INTERRUPT_DEBUG)
>>>>>>> +
>>>>>>> .macro __KVM_HANDLER ivor_nr scratch srr0
>>>>>>> stw r3, VCPU_GPR(R3)(r4)
>>>>>>> stw r5, VCPU_GPR(R5)(r4)
>>>>>>> @@ -212,6 +216,61 @@ _GLOBAL(kvmppc_resume_host)
>>>>>>> stw r9, VCPU_FAULT_ESR(r4)
>>>>>>> ..skip_esr:
>>>>>>>
>>>>>>> + addi r7, r4, VCPU_HOST_DBG
>>>>>>> + mfspr r9, SPRN_DBCR0
>>>>>>> + lwz r8, KVMPPC_DBG_DBCR0(r7)
>>>>>>> + andis. r9, r9, DBCR0_AC_BITS@h
>>>>>>> + beq skip_load_host_debug
>>>>>>> + li r9, 0
>>>>>>> + mtspr SPRN_DBCR0, r9 /* disable all debug event */
>>>>>>> + lwz r9, KVMPPC_DBG_DBCR1(r7)
>>>>>>> + mtspr SPRN_DBCR1, r9
>>>>>>> + lwz r9, KVMPPC_DBG_DBCR2(r7)
>>>>>>> + mtspr SPRN_DBCR2, r9
>>>>>>> + lwz r9, KVMPPC_DBG_IAC1+4(r7)
>>>>>>> + mtspr SPRN_IAC1, r9
>>>>>>> + lwz r9, KVMPPC_DBG_IAC2+4(r7)
>>>>>>> + mtspr SPRN_IAC2, r9
>>>>>>> +#if CONFIG_PPC_ADV_DEBUG_IACS > 2
>>>>>>> + lwz r9, KVMPPC_DBG_IAC3+4(r7)
>>>>>>> + mtspr SPRN_IAC3, r9
>>>>>>> + lwz r9, KVMPPC_DBG_IAC4+4(r7)
>>>>>>> + mtspr SPRN_IAC4, r9
>>>>>>> +#endif
>>>>>>> + lwz r9, KVMPPC_DBG_DAC1+4(r7)
>>>>>>> + mtspr SPRN_DAC1, r9
>>>>>>> + lwz r9, KVMPPC_DBG_DAC2+4(r7)
>>>>>>
>>>>>> Yikes. Please move this into a struct, similar to how the MSR
>>>>>> save/restore happens on x86.
>>>>>>
>>>>>> Also, these are 64-bit loads and stores, no?
>>>>>
>>>>> Ok, you mean PPC_LD/STD?
>>>>
>>>> Yeah :).
>>>>
>>>>>
>>>>>>
>>>>>>> + mtspr SPRN_DAC2, r9
>>>>>>> +skip_load_host_debug:
>>>>>>> + /* Clear h/w DBSR and save current(guest) DBSR */
>>>>>>> + mfspr r9, SPRN_DBSR
>>>>>>> + mtspr SPRN_DBSR, r9
>>>>>>> + isync
>>>>>>> + andi. r7, r6, NEED_DEBUG_SAVE
>>>>>>> + beq skip_dbsr_save
>>>>>>> + /*
>>>>>>> + * If vcpu->guest_debug flag is set then do not check for
>>>>>>> + * shared->msr.DE, as this debugging (say by QEMU) does not
>>>>>>> + * depend on shared->msr.DE. In these scenarios MSR.DE is
>>>>>>> + * always set using shared_msr and should always be handled.
>>>>>>> + */
>>>>>>> + lwz r7, VCPU_GUEST_DEBUG(r4)
>>>>>>> + cmpwi r7, 0
>>>>>>> + bne skip_save_trap_event
>>>>>>> + PPC_LL r3, VCPU_SHARED(r4)
>>>>>>> +#ifndef CONFIG_64BIT
>>>>>>> + lwz r3, (VCPU_SHARED_MSR + 4)(r3)
>>>>>>> +#else
>>>>>>> + ld r3, (VCPU_SHARED_MSR)(r3)
>>>>>>> +#endif
>>>>>>
>>>>>> Didn't we have 64-bit access helpers for these?
>>>>>
>>>>> I will use PPC_LD.
>>>>>
>>>>>>
>>>>>> Also this shouldn't happen in the entry/exit code. We have
>>>>>> shadow_msr for these purposes.
>>>>>
>>>>> Based on MSR_DE in shared_msr, we make up DBSR (set DBSR_TIE).
>>>>
>>>> Hrm. Yeah, must've misread something here.
>>>>
>>>> It would be awesome if we could make the debug register save/restore
>>>> a bit more relaxed though. For example during preemption.
>>>
>>> If we delay this save/restore then the host cannot debug KVM code,
>>> right?
>>
>> You mean using kdb? Does anyone actually do that?
>>
>> But let's leave this for later as an optimization.
>
> Do you mean that the debug registers should be saved/restored on
> preemption? Any specific reason? I think the current save/restore does
> not add much overhead, so we might as well get it right now, shouldn't we?
It definitely adds quite a bit of overhead in a very performance-critical
path (instruction emulation for e500v2).
Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html