On Thu, 23 Sep 2004, Oliver Neukum wrote:

> Ok. The meat of swsusp is in kernel/power/swsusp.c.
> The call trace of the core is: (comments are mine)
> free_some_memory(); // this would require anything that is in the page out path to 
> be active
> device_suspend(3) // this causes device state to be saved in ram
> device_power_down(3); // as above, for those that need interrupts off
> is_problem = suspend_prepare_image(); //the image is generated
> device_power_up(); // wake everything up
> suspend_save_image(); // write out the image; at least the swap device needs to be active
> suspend_power_down(); // kill all power
> 
> IMO the crucial part here is the placement of suspend_prepare_image().
> It ensures that the system image that will be written is an image of all
> devices powered down and all state (like IP assigned to a network interface)
> is saved.
> The resume will power down the devices again, unpack the image, and
> restore all state. I see no reason the first call to device_suspend()
> actually needs to switch power levels. As far as I can tell,
> the requirements are:
> - no DMA
> - no interrupts
> - saved state
> The trick is decoupling taking the image from writing it out.

Thanks, this clarifies things a lot.  In accord with your comments, here 
is what I see:

While saving or restoring the memory image, devices need to be idle.  
Obviously the procedure won't work if devices are constantly
writing/reading memory or generating interrupts.  (Actually, it would work
with devices generating interrupt requests, provided interrupts were
disabled.  But the moment they were enabled again, everything would go to
pieces.)  So yes, no DMA and no IRQs.

I agree that "device idle" says nothing about power consumption.  There's
no particular reason devices have to be powered down while the memory
image is saved or restored.  So there should be a separate kind of suspend
call, to place devices in an "idle" state without regard to power usage.  
In PCI every state above 0 is "idle" -- but there's no reason that should
be true for every device in the system.
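
To make the distinction concrete, here is a rough sketch of what separate
"quiesce" and "power down" operations might look like.  Everything in it
(toy_device, toy_quiesce, toy_power_down, toy_safe_for_image) is made up
for illustration and is not an existing kernel interface.  It only models
the three requirements listed above: no DMA, no interrupts, saved state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device states; NOT the kernel's real API. */
enum dev_state { DEV_RUNNING, DEV_IDLE, DEV_POWERED_DOWN };

struct toy_device {
    const char *name;
    enum dev_state state;
    int dma_active;   /* nonzero while the device may touch memory */
    int irq_enabled;  /* nonzero while the device may raise IRQs */
    int saved_state;  /* nonzero once driver state is saved in RAM */
};

/* Quiesce: stop DMA and IRQs and save state, but leave power alone. */
static void toy_quiesce(struct toy_device *dev)
{
    assert(dev != NULL);
    dev->dma_active = 0;
    dev->irq_enabled = 0;
    dev->saved_state = 1;
    dev->state = DEV_IDLE;
}

/* Power down: only allowed for a device that is already idle. */
static int toy_power_down(struct toy_device *dev)
{
    assert(dev != NULL);
    if (dev->state != DEV_IDLE)
        return -1;
    dev->state = DEV_POWERED_DOWN;
    return 0;
}

/* Nonzero if a memory image may be taken with this device present. */
static int toy_safe_for_image(const struct toy_device *dev)
{
    assert(dev != NULL);
    return !dev->dma_active && !dev->irq_enabled && dev->saved_state;
}
```

The point of the split is that toy_safe_for_image() depends only on the
quiesce step; whether toy_power_down() is called afterward is a separate
policy decision.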

However when the restored image begins executing, drivers may expect the
devices to be in the same state as before the image was created.  This
argues that idle devices must be in a known state, which is presumably one
of minimum power usage.  That would be okay, except:

I don't know how late in the boot process the memory image is restored.  
But it's safe to guess that not all the drivers will have been loaded yet
-- especially drivers that are loaded by hand or through udev/hotplug.  A
device controlled by such a driver _cannot_ be placed in the "known state"
prior to restoring the memory image.  Hence drivers _must_ assume when the
memory image starts executing that devices may be in any idle state.  A 
reset should be the first thing they do.

(Actually, I suppose the "idle" state could be the same as the state 
before the driver was loaded.  Probably that's not a minimum-power state, 
which isn't good but it's not terrible.  Even if this is true, the first 
thing the restored driver should do is reset/reinitialize the device.)
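
A minimal sketch of that rule, using invented names (toy_hw, toy_drv,
toy_resume) rather than any real driver interface: the resume path
ignores whatever the hardware currently contains, resets it, and then
reprograms it from the copy saved in RAM before the image was taken.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for real hardware; purely hypothetical. */
struct toy_hw {
    int configured;
    int reg;
};

/* Stand-in for a driver's private data; purely hypothetical. */
struct toy_drv {
    struct toy_hw *hw;
    int saved_reg;  /* register value saved before the image was made */
    int resets;     /* how many times the device has been reset */
};

static void toy_hw_reset(struct toy_hw *hw)
{
    hw->configured = 0;
    hw->reg = 0;
}

/* Resume: reset first, then restore from the state saved in RAM. */
static void toy_resume(struct toy_drv *drv)
{
    assert(drv != NULL && drv->hw != NULL);

    /* Do not trust whatever state the boot kernel left behind. */
    toy_hw_reset(drv->hw);
    drv->resets++;

    /* Reprogram the device from the saved copy. */
    drv->hw->reg = drv->saved_reg;
    drv->hw->configured = 1;
}
```

With this shape it makes no difference whether the boot kernel had a
driver bound to the device or not; the restored driver ends up in the
same place either way.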

Mass storage devices with removable media or that use a hot-pluggable
transport will cause difficulty.  When the memory image starts running it
will be impossible to know if the devices or their media have been
replaced.  Even if a device were somehow able to make that information
available, it would only be known to the bootup kernel and hence would
be destroyed when the memory image was restored.

To be safe, the kernel must assume that all such devices have gone away
when the image starts executing.  Mounted filesystems would be in trouble.  
Ideally _all_ non-virtual filesystems should be unmounted during the
suspend-to-disk, before the image is prepared.  Restoring the device
connections later would be tricky, particularly since no userspace helper
programs (like udev) would be runnable yet.  Systems with their root
filesystem on a USB drive, for example, would be lucky to resume properly.

Assuming all devices are suspended, is it really necessary to resume
everything in order to write the memory image to a partition?  Or would it
suffice to resume just the swap device and its ancestors in the power
tree?  Come to think of it, is a tree really the right sort of data 
structure to express the power dependencies in the system?  Isn't it 
possible to have devices that won't work right unless _several_ other 
devices (not all of them its ancestors) are powered on?
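
As a sketch of the alternative, suppose each device simply listed
everything it needs powered on, so the structure is a dependency graph
rather than a tree.  Resuming just the swap device then means walking its
dependencies first.  The names here (pnode, resume_order, MAX_DEPS) are
invented for the example, and the code assumes the graph has no cycles.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical power-dependency node; a device may have several deps. */
struct pnode {
    const char *name;
    struct pnode *deps[MAX_DEPS];  /* must be powered on before this one */
    size_t ndeps;
    int visited;
};

/*
 * Append n and its transitive dependencies to out[], dependencies first.
 * Returns the new element count.  Assumes the graph is acyclic and that
 * out[] is large enough to hold every reachable node.
 */
static size_t resume_order(struct pnode *n, struct pnode **out, size_t count)
{
    assert(n != NULL && out != NULL);
    if (n->visited)
        return count;
    n->visited = 1;
    for (size_t i = 0; i < n->ndeps; i++)
        count = resume_order(n->deps[i], out, count);
    out[count++] = n;
    return count;
}
```

Calling resume_order() on the swap device alone yields only the devices
it actually needs, in an order that is safe to power them up; unrelated
devices are never visited and can stay suspended.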

Alan Stern


