On Fri, 2013-08-02 at 14:41 -0400, Alan Stern wrote:
> Steve:
> 
> Here's another irqsoff trace collected by James.  It's not as extreme 
> as before, but I don't know how to interpret the 875-us jump in the 
> time value.  Could this be a result of the timer being adjusted?
> 
> Also, what's the explanation for these two lines:
> 
> URL Clas-3286    2d...  881us : jiffies_to_timeval <-__acct_update_integrals
> URL Clas-3286    2d.h.  882us : exit_idle <-smp_apic_timer_interrupt
> 
> They appear to say that the CPU got a hardirq while interrupts were
> disabled.  How can that be?

If you have NMIs going off, that is probably what's causing issues.

> 
> Alan Stern
> 
> 
> # tracer: irqsoff
> #
> # irqsoff latency trace v1.1.5 on 3.10.0-ver5
> # --------------------------------------------------------------------
> # latency: 906 us, #93/93, CPU#2 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
> #    -----------------
> #    | task: URL Classifier-3286 (uid:1000 nice:0 policy:0 rt_prio:0)
> #    -----------------
> #  => started at: perf_event_update_userpage

I'm guessing that it enabled the tracing with perf in the NMI.
But NMI wont tell the tracer it ended :-(

> #  => ended at:   retint_swapgs
> #
> #
> #                  _------=> CPU#            
> #                 / _-----=> irqs-off        
> #                | / _----=> need-resched    
> #                || / _---=> hardirq/softirq 
> #                ||| / _--=> preempt-depth   
> #                |||| /     delay             
> #  cmd     pid   ||||| time  |   caller      
> #     \   /      |||||  \    |   /           
> URL Clas-3286    2d.h.    1us : local_clock <-perf_event_update_userpage
> URL Clas-3286    2d.h.    2us : watchdog_overflow_callback 
> <-__perf_event_overflow
> URL Clas-3286    2d.h.    3us : arch_trigger_all_cpu_backtrace_handler 
> <-nmi_handle.isra.0
> URL Clas-3286    2d.h.    3us : perf_ibs_nmi_handler <-nmi_handle.isra.0
> URL Clas-3286    2d.h.    3us : perf_ibs_handle_irq <-perf_ibs_nmi_handler
> URL Clas-3286    2d.h.    4us : perf_ibs_handle_irq <-perf_ibs_nmi_handler
> URL Clas-3286    2d.h.    4us!: rcu_nmi_exit <-do_nmi

The tracer started within the NMI, but out of sheer (bad) luck. That's
because the NMI code has no logic to handle tracing interrupts on or
off, due to the problems they cause. We may at most be able to just
ignore all NMIs by adding a 'in_nmi()' check. I can send you a patch if
you want.

The other thing you can do is not run perf while this is going on.


> URL Clas-3286    2d...  879us : smp_apic_timer_interrupt 
> <-apic_timer_interrupt

Yes, we can have interrupts happen here. We may have even went into
userspace, as the return from NMI never told the tracer that we are
enabling interrupts again. Thus, the enable interrupt is what was
missed.

-- Steve

> URL Clas-3286    2d...  879us : irq_enter <-smp_apic_timer_interrupt
> URL Clas-3286    2d...  879us : rcu_irq_enter <-irq_enter
> URL Clas-3286    2d...  879us : rcu_eqs_exit_common.isra.45 <-rcu_irq_enter
> URL Clas-3286    2d...  880us : vtime_account_irq_enter <-irq_enter
> URL Clas-3286    2d...  880us : vtime_account_user <-vtime_account_irq_enter
> URL Clas-3286    2d...  880us : get_vtime_delta <-vtime_account_user
> URL Clas-3286    2d...  880us : _raw_spin_lock <-vtime_account_user


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to