Re: profiling the resume path
Hi Thomas, I found something that looks strange behaviour while profiling the resume-from-RAM path on OLPC. A little explanation on traces: they have been gathered using Arnaldo's refactoring of latencytrace patches From 3e81bb2749ab132f7ff68a22880754e206d7986a Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo [EMAIL PROTECTED] Date: Wed, 25 Jul 2007 15:12:22 -0300 Subject: [PATCH 1/9] [MCOUNT]: Add basic support for gcc profiler instrumentation If CONFIG_MCOUNT is selected and /proc/sys/kernel/mcount_enabled is set to a non-zero value the mcount routine will be called everytime we enter a kernel function that is not marked with the notrace attribute. Then it records that into a ring buffer. so, This no need resched! printk was added inside the while (!need_resched()) { void (*idle)(void); ... } loop of cpu_idle(). And I gathered two different traces of the exact same kernel. Trace #1: [0.059910] ohci_hcd :00:0f.4: powerup ports [0.060119] no need resched! [0.070230] no need resched! [0.097732] no need resched! [0.106032] tick_nohz_stop_sched_tick: next_jiffies = 4294947976, last_jiffies = 4294947974 jiffies = 4294947974 [0.106174] tick_nohz_stop_sched_tick: expires=[106,8] [0.106270] tick_nohz_stop_sched_tick: now=[106,786802612] [0.106369] no need resched! [0.119478] tick_nohz_stop_sched_tick: next_jiffies = 4294947978, last_jiffies = 4294947976 jiffies = 4294947976 [0.119618] tick_nohz_stop_sched_tick: expires=[106,82000] [0.119714] tick_nohz_stop_sched_tick: now=[106,800253059] [0.119813] no need resched! Then the resume path restarts at (this is from mctrace, I added another msleep right after msleep(20) for diagnosis). [0.119336] ohci_pci_resume+0x8c/0xa0 - msleep+0xb/0x27 Trace #2: Another run with the same kernel. [0.059903] ohci_hcd :00:0f.4: powerup ports [0.060108] no need resched! [0.070225] no need resched! [0.097726] no need resched! [0.125226] no need resched! [0.152728] no need resched! [0.180230] no need resched! [0.206887] tick_nohz_stop_sched_tick: next_jiffies = 4294942086, last_jiffies = 4294942084 jiffies = 4294942084 [0.207028] tick_nohz_stop_sched_tick: expires=[47,9] [0.207123] tick_nohz_stop_sched_tick: now=[47,886800917] Which restarts at [0.220191] ohci_pci_resume+0x8c/0xa0 - msleep+0xb/0x27 Sorry, but how the hell that can happen? Both runs look totally hosed since the msleep(20) call from ohci_pci_resume happens at 59ms, and we have an interrupt at 97ms (on both traces) which should get the system out of the idle loop. Full traces: http://dev.laptop.org/~marcelo/mctrace/trace1.txt http://dev.laptop.org/~marcelo/mctrace/trace2.txt Whoever sets TIF_NEED_RESCHED is not doing its job properly? ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: profiling the resume path
On Sun, Sep 02, 2007 at 01:03:12PM -0400, Marcelo Tosatti wrote: On Sun, Sep 02, 2007 at 12:50:03PM +0100, David Woodhouse wrote: On Sun, 2007-09-02 at 04:10 -0400, Marcelo Tosatti wrote: Note: ohci_pci_resume does msleep 20. Hm. It's just waiting for the hardware to settle, right? Do the resume functions for the devices themselves actually have to wait until this is complete, before they can do anything? It really sounds like we need to decouple suspend/resume of individual hardware devices from the full system suspend. It should be almost completely asynchronous. Why can't I start running userspace after resume, before I've reinitialised the USB thermometer which is plugged in? Why don't we just block access to that particular device until it's complete (and take that device off-line to save power even when the full system isn't suspended). You're entirely right, but I'm talking about something else: [0.040909] cpu_idle+0x2e/0x7c - check_pgt_cache+0xb/0x31 [0.00] check_pgt_cache+0x2f/0x31 - quicklist_trim+0xe/0xe5 [0.040912] cpu_idle+0x4e/0x7c - default_idle+0x8/0x43 [0.233593] do_IRQ+0x45/0xdb - irq_enter+0xb/0x2d [0.233594] irq_enter+0x22/0x2d - idle_cpu+0x8/0x1b This interrupt scheduled for 233-40ms is what sounds wrong. It should just continue to blaze off the EHCI resume path. ... after 20ms have passed, not almost 200. ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: profiling the resume path
On Mon, 2007-09-03 at 04:05 -0400, Marcelo Tosatti wrote: This interrupt scheduled for 233-40ms is what sounds wrong. It should just continue to blaze off the EHCI resume path. ... after 20ms have passed, not almost 200. Ah, right. Sorry, I missed the order of magnitude discrepancy. Are we just miscalculating the setting of the wakeup interrupt? More debugging of that calculation might be in order... -- dwmw2 ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: profiling the resume path
On Mon, 2007-09-03 at 09:27 +0100, David Woodhouse wrote: On Mon, 2007-09-03 at 04:05 -0400, Marcelo Tosatti wrote: This interrupt scheduled for 233-40ms is what sounds wrong. It should just continue to blaze off the EHCI resume path. ... after 20ms have passed, not almost 200. Ah, right. Sorry, I missed the order of magnitude discrepancy. Are we just miscalculating the setting of the wakeup interrupt? More debugging of that calculation might be in order... Can you comment out the 'if(tbase_get_deferrable(nte-base)) continue;' in __next_timer_interrupt() at about line 678 of kernel/timer.c ? -- dwmw2 ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: profiling the resume path
On Sun, 2007-09-02 at 04:10 -0400, Marcelo Tosatti wrote: Note: ohci_pci_resume does msleep 20. Hm. It's just waiting for the hardware to settle, right? Do the resume functions for the devices themselves actually have to wait until this is complete, before they can do anything? It really sounds like we need to decouple suspend/resume of individual hardware devices from the full system suspend. It should be almost completely asynchronous. Why can't I start running userspace after resume, before I've reinitialised the USB thermometer which is plugged in? Why don't we just block access to that particular device until it's complete (and take that device off-line to save power even when the full system isn't suspended). -- dwmw2 ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: profiling the resume path
On Sun, Sep 02, 2007 at 12:50:03PM +0100, David Woodhouse wrote: On Sun, 2007-09-02 at 04:10 -0400, Marcelo Tosatti wrote: Note: ohci_pci_resume does msleep 20. Hm. It's just waiting for the hardware to settle, right? Do the resume functions for the devices themselves actually have to wait until this is complete, before they can do anything? It really sounds like we need to decouple suspend/resume of individual hardware devices from the full system suspend. It should be almost completely asynchronous. Why can't I start running userspace after resume, before I've reinitialised the USB thermometer which is plugged in? Why don't we just block access to that particular device until it's complete (and take that device off-line to save power even when the full system isn't suspended). You're entirely right, but I'm talking about something else: [0.040909] cpu_idle+0x2e/0x7c - check_pgt_cache+0xb/0x31 [0.00] check_pgt_cache+0x2f/0x31 - quicklist_trim+0xe/0xe5 [0.040912] cpu_idle+0x4e/0x7c - default_idle+0x8/0x43 [0.233593] do_IRQ+0x45/0xdb - irq_enter+0xb/0x2d [0.233594] irq_enter+0x22/0x2d - idle_cpu+0x8/0x1b This interrupt scheduled for 233-40ms is what sounds wrong. It should just continue to blaze off the EHCI resume path. ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: profiling the resume path
On Wed, 2007-08-29 at 08:31 -0400, Marcelo Tosatti wrote: What you think is the easier/proper way to postpone this console work to happen after the resume process is finished? It's spending all its time waiting for characters to be sent out a serial port which isn't even going to have anything attached to it? Have you considered dropping 'console=ttyS0' from the kernel command line? :) -- dwmw2 ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel