Re: profiling the resume path

2007-09-06 Thread Marcelo Tosatti
Hi Thomas,

I found something that looks strange behaviour while profiling the
resume-from-RAM path on OLPC.

A little explanation on traces: they have been gathered using Arnaldo's
refactoring of latencytrace patches 

From 3e81bb2749ab132f7ff68a22880754e206d7986a Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
Date: Wed, 25 Jul 2007 15:12:22 -0300
Subject: [PATCH 1/9] [MCOUNT]: Add basic support for gcc profiler
instrumentation

If CONFIG_MCOUNT is selected and /proc/sys/kernel/mcount_enabled is set to a
non-zero value the mcount routine will be called everytime we enter a kernel
function that is not marked with the notrace attribute.

Then it records that into a ring buffer.

so, 

This no need resched! printk was added inside the 

while (!need_resched()) {
void (*idle)(void);
...
}

loop of cpu_idle().

And I gathered two different traces of the exact same kernel.

Trace #1:

[0.059910] ohci_hcd :00:0f.4: powerup ports
[0.060119] no need resched!
[0.070230] no need resched!
[0.097732] no need resched!
[0.106032] tick_nohz_stop_sched_tick: next_jiffies = 4294947976, 
last_jiffies = 4294947974 jiffies = 4294947974
[0.106174] tick_nohz_stop_sched_tick: expires=[106,8]
[0.106270] tick_nohz_stop_sched_tick: now=[106,786802612]
[0.106369] no need resched!
[0.119478] tick_nohz_stop_sched_tick: next_jiffies = 4294947978, 
last_jiffies = 4294947976 jiffies = 4294947976
[0.119618] tick_nohz_stop_sched_tick: expires=[106,82000]
[0.119714] tick_nohz_stop_sched_tick: now=[106,800253059]
[0.119813] no need resched!

Then the resume path restarts at (this is from mctrace, I added another
msleep right after msleep(20) for diagnosis).

[0.119336] ohci_pci_resume+0x8c/0xa0 - msleep+0xb/0x27

Trace #2:

Another run with the same kernel.

[0.059903] ohci_hcd :00:0f.4: powerup ports
[0.060108] no need resched!
[0.070225] no need resched!
[0.097726] no need resched!
[0.125226] no need resched!
[0.152728] no need resched!
[0.180230] no need resched!
[0.206887] tick_nohz_stop_sched_tick: next_jiffies = 4294942086,
last_jiffies = 4294942084 jiffies = 4294942084
[0.207028] tick_nohz_stop_sched_tick: expires=[47,9]
[0.207123] tick_nohz_stop_sched_tick: now=[47,886800917]

Which restarts at

[0.220191] ohci_pci_resume+0x8c/0xa0 - msleep+0xb/0x27

Sorry, but how the hell that can happen? 

Both runs look totally hosed since the msleep(20) call from
ohci_pci_resume happens at 59ms, and we have an interrupt at 97ms (on
both traces) which should get the system out of the idle loop.

Full traces:
http://dev.laptop.org/~marcelo/mctrace/trace1.txt
http://dev.laptop.org/~marcelo/mctrace/trace2.txt

Whoever sets TIF_NEED_RESCHED is not doing its job properly?
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: profiling the resume path

2007-09-03 Thread Marcelo Tosatti
On Sun, Sep 02, 2007 at 01:03:12PM -0400, Marcelo Tosatti wrote:
 On Sun, Sep 02, 2007 at 12:50:03PM +0100, David Woodhouse wrote:
  On Sun, 2007-09-02 at 04:10 -0400, Marcelo Tosatti wrote:
   Note: ohci_pci_resume does msleep 20. 
  
  Hm. It's just waiting for the hardware to settle, right? Do the resume
  functions for the devices themselves actually have to wait until this is
  complete, before they can do anything?
  
  It really sounds like we need to decouple suspend/resume of individual
  hardware devices from the full system suspend. It should be almost
  completely asynchronous. Why can't I start running userspace after
  resume, before I've reinitialised the USB thermometer which is plugged
  in? Why don't we just block access to that particular device until it's
  complete (and take that device off-line to save power even when the full
  system isn't suspended).
 
 You're entirely right, but I'm talking about something else:
 
 [0.040909] cpu_idle+0x2e/0x7c - check_pgt_cache+0xb/0x31
 [0.00] check_pgt_cache+0x2f/0x31 - quicklist_trim+0xe/0xe5
 [0.040912] cpu_idle+0x4e/0x7c - default_idle+0x8/0x43
 [0.233593] do_IRQ+0x45/0xdb - irq_enter+0xb/0x2d
 [0.233594] irq_enter+0x22/0x2d - idle_cpu+0x8/0x1b
 
 This interrupt scheduled for 233-40ms is what sounds wrong. It should
 just continue to blaze off the EHCI resume path.

... after 20ms have passed, not almost 200.
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: profiling the resume path

2007-09-03 Thread David Woodhouse
On Mon, 2007-09-03 at 04:05 -0400, Marcelo Tosatti wrote:
  This interrupt scheduled for 233-40ms is what sounds wrong. It should
  just continue to blaze off the EHCI resume path.
 
 ... after 20ms have passed, not almost 200. 

Ah, right. Sorry, I missed the order of magnitude discrepancy. Are we
just miscalculating the setting of the wakeup interrupt? More debugging
of that calculation might be in order...

-- 
dwmw2

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: profiling the resume path

2007-09-03 Thread David Woodhouse
On Mon, 2007-09-03 at 09:27 +0100, David Woodhouse wrote:
 On Mon, 2007-09-03 at 04:05 -0400, Marcelo Tosatti wrote:
   This interrupt scheduled for 233-40ms is what sounds wrong. It should
   just continue to blaze off the EHCI resume path.
  
  ... after 20ms have passed, not almost 200. 
 
 Ah, right. Sorry, I missed the order of magnitude discrepancy. Are we
 just miscalculating the setting of the wakeup interrupt? More debugging
 of that calculation might be in order...

Can you comment out the 'if(tbase_get_deferrable(nte-base)) continue;'
in __next_timer_interrupt() at about line 678 of kernel/timer.c ?

-- 
dwmw2

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: profiling the resume path

2007-09-02 Thread David Woodhouse
On Sun, 2007-09-02 at 04:10 -0400, Marcelo Tosatti wrote:
 Note: ohci_pci_resume does msleep 20. 

Hm. It's just waiting for the hardware to settle, right? Do the resume
functions for the devices themselves actually have to wait until this is
complete, before they can do anything?

It really sounds like we need to decouple suspend/resume of individual
hardware devices from the full system suspend. It should be almost
completely asynchronous. Why can't I start running userspace after
resume, before I've reinitialised the USB thermometer which is plugged
in? Why don't we just block access to that particular device until it's
complete (and take that device off-line to save power even when the full
system isn't suspended).

-- 
dwmw2

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: profiling the resume path

2007-09-02 Thread Marcelo Tosatti
On Sun, Sep 02, 2007 at 12:50:03PM +0100, David Woodhouse wrote:
 On Sun, 2007-09-02 at 04:10 -0400, Marcelo Tosatti wrote:
  Note: ohci_pci_resume does msleep 20. 
 
 Hm. It's just waiting for the hardware to settle, right? Do the resume
 functions for the devices themselves actually have to wait until this is
 complete, before they can do anything?
 
 It really sounds like we need to decouple suspend/resume of individual
 hardware devices from the full system suspend. It should be almost
 completely asynchronous. Why can't I start running userspace after
 resume, before I've reinitialised the USB thermometer which is plugged
 in? Why don't we just block access to that particular device until it's
 complete (and take that device off-line to save power even when the full
 system isn't suspended).

You're entirely right, but I'm talking about something else:

[0.040909] cpu_idle+0x2e/0x7c - check_pgt_cache+0xb/0x31
[0.00] check_pgt_cache+0x2f/0x31 - quicklist_trim+0xe/0xe5
[0.040912] cpu_idle+0x4e/0x7c - default_idle+0x8/0x43
[0.233593] do_IRQ+0x45/0xdb - irq_enter+0xb/0x2d
[0.233594] irq_enter+0x22/0x2d - idle_cpu+0x8/0x1b

This interrupt scheduled for 233-40ms is what sounds wrong. It should
just continue to blaze off the EHCI resume path.

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


Re: profiling the resume path

2007-08-29 Thread David Woodhouse
On Wed, 2007-08-29 at 08:31 -0400, Marcelo Tosatti wrote:
 What you think is the easier/proper way to postpone this console work
 to happen after the resume process is finished?

It's spending all its time waiting for characters to be sent out a
serial port which isn't even going to have anything attached to it?

Have you considered dropping 'console=ttyS0' from the kernel command
line? :)

-- 
dwmw2

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel