Re: [Ilw] iwlagn and kvm related "BUG: scheduling while atomic" after resuming

Gleb Natapov Tue, 19 Jul 2011 02:13:31 -0700

On Tue, Jul 19, 2011 at 12:04:21PM +0300, Avi Kivity wrote:
> On 07/19/2011 12:51 AM, Stefan Hajnoczi wrote:
> >On Mon, Jul 18, 2011 at 9:18 PM, Berg, Johannes <[email protected]> wrote:
> >>>  Today I encountered a "BUG: scheduling while atomic" from kvm.ko when
> >>>  resuming the host from suspend-to-RAM.  I captured as much of the oops as
> >>>  was displayed on screen:
> >>>
> >>>  http://vmsplice.net/~stefan/panic1.jpg
> >>>  http://vmsplice.net/~stefan/panic2.jpg
> >>>
> >>>  It looks like the iwlagn driver may have crashed in an interrupt handler
> >>>  and the kvm-related panic was triggered in the aftermath.  Any ideas?
> >>
> >>  This doesn't look like iwlagn is involved at all -- the fact that it comes
> >>  up in the backtrace seems to be an artifact of backtracing not being
> >>  perfect. The RIP points to kvm_arch_vcpu_ioctl_run+0x927 and there's no
> >>  reason to believe that iwlagn should crash kvm.
> >
> >RIP seems to be arch/x86/kvm/x86.c:vcpu_enter_guest():
> >
> >     preempt_disable();
> >
> >     kvm_x86_ops->prepare_guest_switch(vcpu);
> >     if (vcpu->fpu_active)
> >             kvm_load_guest_fpu(vcpu);
> >     kvm_load_guest_xcr0(vcpu);
> >
> >     vcpu->mode = IN_GUEST_MODE;
> >
> >     /* We should set ->mode before check ->requests,
> >      * see the comment in make_all_cpus_request.
> >      */
> >     smp_mb();
> >
> >     local_irq_disable();
> >
> >     if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
> >         || need_resched() || signal_pending(current)) {
> >             vcpu->mode = OUTSIDE_GUEST_MODE;
> >             smp_wmb();
> >             local_irq_enable();
> >             preempt_enable();
> >             kvm_x86_ops->cancel_injection(vcpu);
> >             r = 1;
> >             goto out;
> >     }
> >
> >     srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> >
> >     kvm_guest_enter();
> >
> >     if (unlikely(vcpu->arch.switch_db_regs)) {
> >             set_debugreg(0, 7);
> >             set_debugreg(vcpu->arch.eff_db[0], 0);
> >             set_debugreg(vcpu->arch.eff_db[1], 1);
> >             set_debugreg(vcpu->arch.eff_db[2], 2);
> >             set_debugreg(vcpu->arch.eff_db[3], 3);
> >     }
> >
> >     trace_kvm_entry(vcpu->vcpu_id);
> >     kvm_x86_ops->run(vcpu);
> >
> >     /*
> >      * If the guest has used debug registers, at least dr7
> >      * will be disabled while returning to the host.
> >      * If we don't have active breakpoints in the host, we don't
> >      * care about the messed up debug address registers. But if
> >      * we have some of them active, restore the old state.
> >      */
> >     if (hw_breakpoint_active())
> >             hw_breakpoint_restore();
> >
> >     kvm_get_msr(vcpu, MSR_IA32_TSC, &vcpu->arch.last_guest_tsc);
> >
> >     vcpu->mode = OUTSIDE_GUEST_MODE;
> >     smp_wmb();
> >     local_irq_enable();  /* <--- boom! */
> 
> Preemption is still disabled at this point.  Where does the
> "scheduling while atomic" come from?  Nothing in this area attempts
> to schedule.
> 
0x10000000 in the preemption counter is PREEMPT_ACTIVE, so this looks like
a preemptible kernel trying to preempt itself.
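
For anyone who wants to decode the 0x10000100 value from the oops by hand,
here is a minimal userspace sketch. Only PREEMPT_ACTIVE == 0x10000000 comes
from this thread; the other masks are an assumption based on the generic
preempt_count layout of kernels from that period (8 bits of preempt depth,
8 bits of softirq depth, then the hardirq bits). The struct and function
names are made up for illustration and are not kernel APIs.

```c
#include <stdio.h>

/* Assumed field layout; check your kernel's headers before relying on it. */
#define PREEMPT_MASK    0x000000ffu  /* preempt_disable() nesting depth */
#define SOFTIRQ_MASK    0x0000ff00u  /* softirq nesting depth */
#define HARDIRQ_MASK    0x03ff0000u  /* hardirq nesting depth */
#define PREEMPT_ACTIVE  0x10000000u  /* value stated in this thread (x86) */

#define SOFTIRQ_SHIFT   8
#define HARDIRQ_SHIFT   16

/* Hypothetical helper type: one decoded preempt counter value. */
struct preempt_fields {
    unsigned int preempt_depth;
    unsigned int softirq_depth;
    unsigned int hardirq_depth;
    int preempt_active;   /* 1 if the PREEMPT_ACTIVE bit is set */
};

/* Split a raw preempt counter value into its fields. */
struct preempt_fields decode_preempt_count(unsigned int count)
{
    struct preempt_fields f;

    f.preempt_depth  = count & PREEMPT_MASK;
    f.softirq_depth  = (count & SOFTIRQ_MASK) >> SOFTIRQ_SHIFT;
    f.hardirq_depth  = (count & HARDIRQ_MASK) >> HARDIRQ_SHIFT;
    f.preempt_active = (count & PREEMPT_ACTIVE) != 0;
    return f;
}

/* Print the decoded fields on one line. */
void print_preempt_count(unsigned int count)
{
    struct preempt_fields f = decode_preempt_count(count);

    printf("0x%08x: preempt=%u softirq=%u hardirq=%u PREEMPT_ACTIVE=%d\n",
           count, f.preempt_depth, f.softirq_depth, f.hardirq_depth,
           f.preempt_active);
}
```

Under those assumptions, decode_preempt_count(0x10000100u) gives preempt
depth 0, softirq depth 1, hardirq depth 0, with PREEMPT_ACTIVE set.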


> The preemption counter is 0x10000100, indicating zero preempt depth
> (wrong for this point, should be 1), and 1 softirq depth (doesn't
> make much sense).  Looks very wrong, like the preempt mixup that
> occurred recently on some archs other than x86_64.
> 
> Can you post some disassembly around %rip?
> 
> -- 
> error compiling committee.c: too many arguments to function
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
                        Gleb.
