On Thu, 23 Sep 2004, Oliver Neukum wrote:

> Ok. The meat of swsusp is in kernel/power::swsusp.c.
> The call trace of the core is: (comments are mine)
> free_some_memory(); // this would require anything that is in the page out path to
> be active
> device_suspend(3) // this causes device state to be saved in ram
> device_power_down(3); // as above, for those that need interrupts off
> is_problem = suspend_prepare_image(); //the image is generated
> device_power_up(); // wake everything up
> suspend_save_image(); // write out the image at least swap device needs to be active
> suspend_power_down(); // kill all power
>
> IMO the crucial part here is the placement of suspend_prepare_image().
> It ensures that the system image that will be written is an image of all
> devices powered down and all state (like IP assigned to a network interface)
> is saved.
> The resume will power down the devices again, unpack the image, and
> restore all state. I see no reason the first call to device_suspend()
> actually needs to do actually switch power levels. As far as I can tell,
> the requirements are:
> - no DMA
> - no interrupts
> - saved state
> The trick is decoupling taking the image from writing it out.

Thanks, this clarifies things a lot. In accord with your comments, here is what I see:

While saving or restoring the memory image, devices need to be idle. Obviously the procedure won't work if devices are constantly writing/reading memory or generating interrupts. (Actually, it would work with devices generating interrupt requests, provided interrupts were disabled. But the moment they were enabled again, everything would go to pieces.) So yes, no DMA and no IRQs.

I agree that "device idle" says nothing about power consumption. There's no particular reason devices have to be powered down while the memory image is saved or restored. So there should be a separate kind of suspend call, to place devices in an "idle" state without regard to power usage. In PCI every state above 0 is "idle" -- but there's no reason that should be true for every device in the system.

However, when the restored image begins executing, drivers may expect the devices to be in the same state as before the image was created. This argues that idle devices must be in a known state, which is presumably one of minimum power usage.

That would be okay, except: I don't know how late in the boot process the memory image is restored. But it's safe to guess that not all the drivers will have been loaded yet -- especially drivers that are loaded by hand or through udev/hotplug. A device controlled by such a driver _cannot_ be placed in the "known state" prior to restoring the memory image. Hence drivers _must_ assume when the memory image starts executing that devices may be in any idle state. A reset should be the first thing they do.

(Actually, I suppose the "idle" state could be the same as the state before the driver was loaded. Probably that's not a minimum-power state, which isn't good but it's not terrible. Even if this is true, the first thing the restored driver should do is reset/reinitialize the device.)

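To make the idea concrete, here is a minimal user-space sketch in C of a separate "idle" call plus the reset-on-resume rule. It is only a model under my own assumptions: sketch_dev, sketch_quiesce(), sketch_can_snapshot(), sketch_image_restored() and sketch_resume() are hypothetical names, not part of the driver model or of swsusp, and a single integer stands in for a device's register state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in the kernel. */
enum hw_state { HW_RUNNING, HW_IDLE, HW_UNKNOWN };

struct sketch_dev {
    const char *name;
    enum hw_state hw;   /* what the hardware is doing right now */
    int live_config;    /* stands in for device registers */
    int saved_config;   /* copy kept in RAM, so it lands in the image */
    bool state_saved;
};

/* "Idle" suspend: save state and stop DMA/IRQs; power level is untouched. */
void sketch_quiesce(struct sketch_dev *d)
{
    d->saved_config = d->live_config;
    d->state_saved = true;
    d->hw = HW_IDLE;
}

/* The image may be prepared only if every device is idle with state saved. */
bool sketch_can_snapshot(const struct sketch_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i].hw != HW_IDLE || !devs[i].state_saved)
            return false;
    return true;
}

/* After the image is restored, the hardware is in whatever state the
 * boot kernel left it in, which the restored driver cannot know. */
void sketch_image_restored(struct sketch_dev *d)
{
    d->hw = HW_UNKNOWN;
    d->live_config = 0;
}

/* Resume: reset first, then reprogram the device from the saved state. */
bool sketch_resume(struct sketch_dev *d)
{
    if (!d->state_saved)
        return false;
    d->live_config = 0;               /* reset/reinitialize */
    d->live_config = d->saved_config; /* restore saved state */
    d->hw = HW_RUNNING;
    d->state_saved = false;
    return true;
}
```

The point of the sketch is that sketch_resume() never looks at the hardware state it finds; it resets unconditionally and relies only on what was saved in RAM before the image was taken.
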
Mass storage devices that have removable media or use a hot-pluggable transport will cause difficulty. When the memory image starts running it will be impossible to know if the devices or their media have been replaced. Even if a device were somehow able to make that information available, it would only be known to the bootup kernel and hence would be destroyed when the memory image was restored. To be safe, the kernel must assume that all such devices have gone away when the image starts executing.

Mounted filesystems would be in trouble. Ideally _all_ non-virtual filesystems should be unmounted during the suspend-to-disk, before the image is prepared. Restoring the device connections later would be tricky, particularly since no userspace helper programs (like udev) would be runnable yet. Systems with their root filesystem on a USB drive, for example, would be lucky to resume properly.

Assuming all devices are suspended, is it really necessary to resume everything in order to write the memory image to a partition? Or would it suffice to resume just the swap device and its ancestors in the power tree?

Come to think of it, is a tree really the right sort of data structure to express the power dependencies in the system? Isn't it possible to have devices that won't work right unless _several_ other devices (not all of them its ancestors) are powered on?

Alan Stern
