Re: 2.6.24 Temperature/speed _not_ normal - no thermal throttling?
Hi Ron,

Throttling is meant as a last line of defense before powering off the machine, not as a thermal regulation feature. Please check that you have cpufreq compiled in and that it is able to change the frequency.

Please open a bug report at bugzilla.kernel.org against ACPI/Thermal, and attach the dmesg output and the output of 'grep . /proc/acpi/thermal_zone/*/*'.

Thanks,
Alex.

Ron Rechenmacher wrote:

> Hi,
>
> I believe I am having a critical thermal problem. I do not know if it is limited to the 2.6.24.2 kernel which I am running. I do see there has been some discussion about thermal zones and throttling on the list, but I cannot tell whether it means that thermal throttling is not working in 2.6.24.2.
>
> When I try to build several kernel source rpms, my Dell D830 laptop seems to overheat and hang. It's happened 3 times now and I would like to learn what's going on and not let it happen again. I'm a newbie (and have had problems trying to post :), so I do apologize if I've missed something relatively simple or if this post is not appropriate in any way.
>
> I'm running a Scientific Linux 5 (based on RHEL5) distribution and am just running the cpuspeed user space utility, and therefore do not believe I have any user space process watching temperature. However, in the earlier kernels, I used to be able to (manually) write to /proc/acpi/processor/CPU0/throttling and see a change when read back, but now the write does not seem to do anything. This might be OK, as I'm thinking the kernel and/or the hardware itself might now be supposed to be doing the throttling?
>
> Anyway, in 4 windows, I run:
>
>   win1: stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 180s
>   win2: while sleep 1;do cat /proc/acpi/thermal_zone/THM/temperature;done
>   win3: tail -f /var/log/messages
>   win4: while sleep 1;do cat /proc/acpi/processor/CPU0/throttling;done
>
> In win2, I see the temperature go from 50 C to over 86 C.
>
> In win3, before the temperature in win2 reaches 70 C, I see
>
>   kernel: CPU0: Temperature/speed normal
>
> (and also CPU1) and
>
>   kernel: Machine check events logged
>
> The temperature would probably just continue to climb if I ran the test for longer than 180 seconds (the kernel rpms take much longer and do not complete before the system hangs :(
>
> In /var/log/mcelog (running mcelog-0.8pre), I only see "Processor core below trip temperature. Throttling disabled" messages. This is strange because throttling seems to be being disabled after never being enabled. (Is there a newer mcelog I should be running?)
>
> The fan speed does increase, but the throttling state indication never changes (it's always T0: 100%).
>
> It seems that when I build the kernel rpms, the increased fan speed is not enough to keep the temperature from running away. It seems that thermal throttling would be required and is not happening.
>
> Should I be doing something from user space? Can I do something from user space?
>
> Thanks,
> Ron

-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
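The check Ron performs by eye in win2 can be sketched in C. This is only an illustration: it assumes the temperature file contains a single line of the form "temperature: NN C" (inferred from the readings quoted above), and the function names and the threshold passed in are hypothetical, not anything the kernel or cpuspeed provides.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one line assumed to look like "temperature:             86 C".
 * Returns 0 and stores the value in *celsius on success, -1 otherwise.
 */
int parse_acpi_temperature(const char *line, int *celsius)
{
    int value;
    char unit;

    assert(line != NULL && celsius != NULL);
    if (sscanf(line, "temperature: %d %c", &value, &unit) != 2 || unit != 'C')
        return -1;
    *celsius = value;
    return 0;
}

/*
 * Returns 1 if the reading is above limit_celsius, 0 if it is not,
 * and -1 if the line could not be parsed.
 */
int temperature_exceeds(const char *line, int limit_celsius)
{
    int celsius;

    if (parse_acpi_temperature(line, &celsius) != 0)
        return -1;
    return celsius > limit_celsius;
}
```

A user space watcher could read the file once per second with fgets() and pass each line to temperature_exceeds(); what it should do when the limit is crossed (for example, lowering the CPU frequency through cpufreq) is left open here.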
Re: 2.6.25-rc2-mm1 (x64 thermal build failure)
On Wed, 20 Feb 2008 08:21:33 +0100 Thomas Petazzoni [EMAIL PROTECTED] wrote:

> On Tue, 19 Feb 2008 15:21:29 -0800, Andrew Morton [EMAIL PROTECTED] wrote:
>
> > ug, sorry, if I'd realised it was like this I'd have said don't bother. Apart from the obvious problem, this means that people will keep breaking CONFIG_DMI=n all the time, because they will forget the ifdefs, and the number of people who test with CONFIG_DMI=n will be small.
>
> Yes, #ifdef CONFIG_DMI is not very comfortable. That is why I proposed things such as DECLARE_DMI_FIXUP_TABLE(), because it would force people to use these macros, which would then work correctly depending on DMI=y/n. However, there's still the issue of driver_data that I mentioned in my earlier post.
>
> What should I do? Option 1? Option 2? Give up on the patch?
>
> Thanks for your comments,

Option 1 would be best, I think:

> 1) Remove the #ifdef CONFIG_DMI around the DMI fixup tables and callback definitions, so that everything exists and gcc is happy. gcc is able to optimize out the DMI fixup table (it is not present in the binary when compiling with DMI=n), but gcc doesn't seem to be able to optimize out the DMI fixup callbacks (they are still present in the binary). So this would leave some unused code in the binary, which is not completely satisfying.

gcc _should_ be able to remove the callbacks as long as they are static and have no references. If even the latest gcc versions are still including the unreferenced, static functions in the final vmlinux, then let's get gcc fixed?
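The shape of "Option 1" can be sketched in plain user space C. Everything here is a stand-in: struct fixup_id, check_fixups() and the SKETCH_DMI macro are hypothetical names that only play the roles of the kernel's DMI table type, its table walker and CONFIG_DMI. The point is that the table and the callback are defined unconditionally, and only the walker is stubbed out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMI fixup table entry; not the kernel type. */
struct fixup_id {
    int (*callback)(const struct fixup_id *id);
    const char *ident;
};

#ifdef SKETCH_DMI /* plays the role of CONFIG_DMI=y */
/* Walk the table and run every callback, counting the ones that matched. */
static int check_fixups(const struct fixup_id *table)
{
    int hits = 0;

    for (; table->callback != NULL; table++)
        hits += table->callback(table);
    return hits;
}
#else
/* Stub for the DMI=n case: the table is accepted but never walked. */
static inline int check_fixups(const struct fixup_id *table)
{
    (void)table;
    return 0;
}
#endif

/* Defined without any #ifdef, as in Option 1. */
static int quirk_callback(const struct fixup_id *id)
{
    assert(id != NULL);
    return 1;
}

static const struct fixup_id fixup_table[] = {
    { quirk_callback, "Example Board" },
    { NULL, NULL },
};

int apply_fixups(void)
{
    return check_fixups(fixup_table);
}
```

With SKETCH_DMI undefined, nothing ever calls quirk_callback(); whether the compiler then drops both the table and the callback from the object file is exactly the question raised in the thread, and can be checked by inspecting the symbol table of an optimized build.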
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
Thomas Renninger wrote:

> Hi,
>
> please correct me if I am wrong or missed something: osi=linux was an ability for vendors to provide Linux specific BIOS updates/quirks. _OSI(Linux) returned true until kernel version 2.6.23 (included?). This has been replaced by rather huge black+white lists (at least they most probably will grow huge) to get rid of it again. The goal is to not return _OSI(Linux), as Intel identified it (after inventing it? Doesn't really matter...) as not the right thing to do, because it does not consider fixes that might show up in specific kernel versions in the future. This is the only reason I found, did I miss one or the other?

Linux is way too generic. The kernel is such a fast-moving target that it changes with every other version. The idea is to replace _OSI(Linux) with _OSI(Needs video POST after resume) etc., to allow the BIOS to figure out what _exactly_ it needs to do, rather than guessing based on the kernel's version etc.

A lot has been discussed in linux-acpi, though mostly hidden in related threads. Ask Henrique, he's been trying to come up with a proper _OSI() interface. See for example this thread: http://www.mail-archive.com/linux-acpi@vger.kernel.org/msg12262.html

The idea behind removing _OSI(Linux) is to get the hardware vendors to stop using it _now_; we don't want them to use it any longer. It will take some time to come up with a proper interface, but at least we'll have less legacy to deal with.

> Some questions and suggestions:
>
> Is there already a replacement, or will something pop up soon which I may have missed?
>
> This is an interface to the outside world (out of the kernel... in this case not to userspace via /proc, /sys or ioctl, but to the BIOS). Such interfaces should have a very long lifetime once they are implemented. In this case it should have an even longer lifetime than any /sys or whatever userspace interface. Telling vendors that this will vanish, and giving them time to remove it from their BIOSes or better replace it with something else, is the way to go here IMO.
>
> The current policy is to just return zero on _OSI(Linux). I don't get why you do this. You break machines on purpose. Machines where vendors possibly have invested time to improve them for Linux. Why don't you just announce it, write it down in Documentation, and also write it to dmesg; instead of "pls send acpidump to acpi list", something like: "osi=linux is deprecated and will get removed" (OK, way too many of these have popped up, but maybe dmi-blacklist the message, don't do any functional change for now...).

Maybe that just didn't get outside of the linux-acpi mailing list, but that _OSI(Linux) is deprecated has been known for some time now. But you are right, it was never publicly announced. First, users were asked to try acpi_osi=!Linux (sometime in the .23/.24 timeframe?) and see if it works better. Now !Linux is the default, and they are asked to try acpi_osi=Linux so that we can fill the blacklist.

> Why shouldn't I remove the whole dmi black/white listing from our OpenSUSE kernel and return true for _OSI(Linux)? This probably fixes a lot of machines and avoids bug reports (and annoyed users). I plan to do this rather soon if I don't get some good arguments. IMO this should also be done in mainline. It is a pity that 2.6.24 now has this. IMO this should even go back to a 2.6.24.X stable kernel. Just leave it in, announce that it should not be used, and provide something else (there is time for that then...).
>
> ---
>
> Here is a suggestion for an enhanced Linux Operating System Identification interface for ACPI:
>
> For general BIOS hot-fixes, a check for osi=linux is sufficient for vendors and allows them to provide a fix without risking breakage of their Windows OS. This one should stay.

No, it's not sufficient, it's useless. Linux - what should that stand for? How should the BIOS vendors interpret it? It's totally ambiguous. It was removed, and should stay removed. The BIOS can do _OSI(Windows 2006) because Windows 2006 defines a non-moving target. MS will not change how Vista behaves without changing the string (they sometimes update the string in service packs).

> The problem is that, at the time the BIOS is published with the short term workaround, you do not know in which kernel version this might get fixed. While this knob is essential for vendors for pre-loads, it might break the machine if the user tries to install a newer Linux distribution with a newer kernel where the problem got fixed. Then the workaround might even slow down or break the system...
>
> An example: Lenovo wants to get rid of brightness switching via their old method (int10?). But this needs in-kernel graphics driver support for Intel graphics cards. Therefore ibm_acpi currently simulates this; the specified ACPI brightness interface cannot be used. In which kernel version in-kernel graphics drivers will be supported, so that Lenovo can safely throw out int10 brightness switching from their BIOSes, is not known yet. I
BIOS bug introduced on some ASUS machines and some Mainboards
Hi,

JFYI: Due to a silly BIOS bug which seems to get copied to a wider range of Asus (and friends) machines/mainboards, the following machines/mainboards will show a protection fault when the thermal ACPI driver gets loaded:

  Biostar TA770 (BIOS A78XA125 2008-01-28)
  Foxconn M520A (BIOS 784F1P05 2008-02-01)
  Asus M2N32-SLI Deluxe (BIOS version still working: 1201; broken BIOS versions: 1302, 1503, 1603, Beta 1701)
  Asus M2A HDMI

A bug with additional info can be found here: https://bugzilla.novell.com/show_bug.cgi?id=350017

According to Robert (thanks for looking into this), this should be fixed in ACPICA version 20071019. The fix, or rather workaround, and the problem are described at the end. AFAIK the current ACPICA version in the mainline kernel (if the date in include/acpi/acconfig.h is correct) still is: 20070126

So if you have one of these boards, better wait a bit with a BIOS update. If you have similar problems on other machines, please let us know, so that others don't run into this.

Affected people should see a stack trace like this in dmesg when trying to load the thermal module:

  RIP: 0010:[] [] acpi_ns_map_handle_to_node+0x14/0x1d
  ...
  [] acpi_get_data+0x3e/0x6e
  [] acpi_bus_get_device+0x25/0x68
  [] :thermal:acpi_thermal_trip_seq_show+0x12b/0x257
  [] seq_read+0x105/0x28b
  [] vfs_read+0xcb/0x153
  [] sys_read+0x45/0x6e
  [] system_call+0x7e/0x83

The bug is that they added dual or quad core support (I expect the latter, not sure right now) to their DSDT and renamed the CPU objects, but forgot to rename the reference to the passive cooling device (which is a processor object).

When you have the kernel fix that can handle this gracefully you should only see:

Hmm, older kernels had a warning, it seems this has been removed. It got removed very recently (02-02-08): git commit ce44e19701ac1de004815c225585ff617c5948b4

This message should get added again, I'll send a separate patch. You should then see this line in dmesg if the kernel can handle the BIOS bug gracefully:

  Invalid passive threshold

Thomas
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Wed, 20 Feb 2008, Tomas Carnecky wrote:

> A lot has been discussed in linux-acpi, though mostly hidden in related threads. Ask Henrique, he's been trying to come up with a proper _OSI() interface. See for example this thread:

No, I haven't. Len is the mastermind behind it, AFAIK. I am just one of the interested parties.

I have been burned by halfway-done jobs in the kernel once-too-many already IMO, so I am sticking around the threads to make sure THIS time, at least I will also be at fault if it is done wrong.

The fact that Lenovo (and therefore, ThinkPads) is one of the vendors directly affected by OSI(Linux) issues, has a LOT to do with my continued participation on this issue, too. The saner the ThinkPad firmware is, from a Linux standpoint, the better for me.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Wed, 2008-02-20 at 11:31 +0100, Tomas Carnecky wrote:

> Thomas Renninger wrote:
>
> > Hi,
> >
> > please correct me if I am wrong or missed something: osi=linux was an ability for vendors to provide Linux specific BIOS updates/quirks. _OSI(Linux) returned true until kernel version 2.6.23 (included?). This has been replaced by rather huge black+white lists (at least they most probably will grow huge) to get rid of it again. The goal is to not return _OSI(Linux), as Intel identified it (after inventing it? Doesn't really matter...) as not the right thing to do, because it does not consider fixes that might show up in specific kernel versions in the future. This is the only reason I found, did I miss one or the other?
>
> Linux is way too generic. The kernel is such a fast-moving target that it changes with every other version. The idea is to replace _OSI(Linux) with _OSI(Needs video POST after resume) etc., to allow the BIOS to figure out what _exactly_ it needs to do, rather than guessing based on the kernel's version etc.

OK, I see the point, and I agree that relying on kernel versions alone is a bit problematic. I still like it, because it seems to me the only way that is reasonably generic.

So the suggestion is to introduce very specific strings (hopefully not many) for specific milestones/patches: similar to "Windows 2006 SP1", we might have an "in-kernel graphics" string, but hopefully this will be the only one... The string(s) would be listed on acpi.sourceforge.com? Hmm, I am not sure whether this is such a perfect idea or could work at all, maybe for very extreme/huge changes.

But that is not enough. There is nothing wrong with giving vendors the possibility to check whether they are running on Linux. Document the problem of future kernel versions for vendors. I expect that most of the if(linux) AML fixups from vendors make very much sense to keep forever, for all the cases like: "Windows likes to have it this way, but that violates all kinds of specifications, so for Linux we do it right and provide a proper interface." There are dozens more arguments...

> A lot has been discussed in linux-acpi, though mostly hidden in related threads. Ask Henrique, he's been trying to come up with a proper _OSI() interface. See for example this thread: http://www.mail-archive.com/linux-acpi@vger.kernel.org/msg12262.html
>
> The idea behind removing _OSI(Linux) is to get the hardware vendors to stop using it _now_; we don't want them to use it any longer. It will take some time to come up with a proper interface, but at least we'll have less legacy to deal with.
>
> > Some questions and suggestions:
> >
> > Is there already a replacement, or will something pop up soon which I may have missed?
> >
> > This is an interface to the outside world (out of the kernel... in this case not to userspace via /proc, /sys or ioctl, but to the BIOS). Such interfaces should have a very long lifetime once they are implemented. In this case it should have an even longer lifetime than any /sys or whatever userspace interface. Telling vendors that this will vanish, and giving them time to remove it from their BIOSes or better replace it with something else, is the way to go here IMO.
> >
> > The current policy is to just return zero on _OSI(Linux). I don't get why you do this. You break machines on purpose. Machines where vendors possibly have invested time to improve them for Linux. Why don't you just announce it, write it down in Documentation, and also write it to dmesg; instead of "pls send acpidump to acpi list", something like: "osi=linux is deprecated and will get removed" (OK, way too many of these have popped up, but maybe dmi-blacklist the message, don't do any functional change for now...).
>
> Maybe that just didn't get outside of the linux-acpi mailing list, but that _OSI(Linux) is deprecated has been known for some time now. But you are right, it was never publicly announced. First, users were asked to try acpi_osi=!Linux (sometime in the .23/.24 timeframe?) and see if it works better. Now !Linux is the default, and they are asked to try acpi_osi=Linux so that we can fill the blacklist.

It does not work like that! You cannot provide an important interface, then realize that it is not exactly what it was intended for, and just rip it out. While modifying sysfs and procfs without an announcement must not happen either, that is by far not as bad.

Just keep it, tell the vendors that they should not use it, and put in ugly messages if they do (if you like, black/white list the message...), if this really should vanish, which is IMO a really bad idea. But do not just remove it!

> > Why shouldn't I remove the whole dmi black/white listing from our OpenSUSE kernel and return true for _OSI(Linux)? This probably fixes a lot of machines and avoids bug reports (and annoyed users). I plan to do this rather soon if I don't get some good arguments. IMO this should also be done in mainline. It is a pity that 2.6.24 now has this.
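The policy being argued about can be modelled with a toy C function. This is a sketch of the behaviour described in the thread, not the ACPICA implementation: toy_osi() and the linux_enabled switch are hypothetical, and the only "supported" string listed is the "Windows 2006" example mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Interface strings this toy model answers "yes" to (illustrative only). */
static const char *const supported_interfaces[] = {
    "Windows 2006",
};

/*
 * Toy model of an _OSI query from the BIOS.
 * linux_enabled == 0 models the default described in the thread
 * (equivalent to acpi_osi=!Linux); linux_enabled == 1 models booting
 * with acpi_osi=Linux.
 * Returns 1 if the interface string is reported as supported, else 0.
 */
int toy_osi(const char *iface, int linux_enabled)
{
    size_t i;

    assert(iface != NULL);
    if (strcmp(iface, "Linux") == 0)
        return linux_enabled != 0;

    for (i = 0; i < sizeof(supported_interfaces) / sizeof(supported_interfaces[0]); i++) {
        if (strcmp(iface, supported_interfaces[i]) == 0)
            return 1;
    }
    return 0;
}
```

In this model a feature string such as "Needs video POST after resume" is simply another table entry once an interface for it exists, which is the direction Tomas describes; keeping "Linux" answerable by default, as Thomas asks, corresponds to flipping the default of linux_enabled.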
Re: [PATCH] PM: Remove unbalanced mutex_unlock() from dpm_resume()
On Wed, 20 Feb 2008, Rafael J. Wysocki wrote:

> Hi Greg,
>
> Please consider taking the following fix for 2.6.25.

Don't just consider it! :-) It's a real bug fix.

> Thanks,
> Rafael
>
> ---
> From: Rafael J. Wysocki [EMAIL PROTECTED]
>
> Remove an unnecessary unlocking of dpm_list_mtx in the error path in drivers/base/power/main.c:dpm_suspend() .
>
> Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
> ---
>  drivers/base/power/main.c |    1 -
>  1 file changed, 1 deletion(-)
>
> Index: linux-2.6/drivers/base/power/main.c
> ===================================================================
> --- linux-2.6.orig/drivers/base/power/main.c
> +++ linux-2.6/drivers/base/power/main.c
> @@ -479,7 +479,6 @@ static int dpm_suspend(pm_message_t stat
>              mutex_lock(&dpm_list_mtx);
>              if (list_empty(&dev->power.entry))
>                  list_add(&dev->power.entry, &dpm_locked);
> -            mutex_unlock(&dpm_list_mtx);
>              break;
>          }
>          mutex_lock(&dpm_list_mtx);

Acked-by: Alan Stern [EMAIL PROTECTED]
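Why the extra unlock is a bug can be shown with a small user space model. The sketch below is an assumption about the surrounding loop, which the hunk does not show: it supposes that dpm_list_mtx is held on loop entry, dropped around each per-device call, retaken afterwards, and released once after the loop. The toy_mutex counter and suspend_all() are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock: depth 1 = held, 0 = free, negative = unlocked too often. */
struct toy_mutex { int depth; };

static void toy_lock(struct toy_mutex *m)   { m->depth++; }
static void toy_unlock(struct toy_mutex *m) { m->depth--; }

/*
 * errors[i] is the result of suspending device i (0 = success).
 * extra_unlock = 1 keeps the unlock that the patch deletes.
 * Returns the lock depth after the function's common exit path.
 */
int suspend_all(const int *errors, size_t n, int extra_unlock)
{
    struct toy_mutex mtx = { 0 };
    size_t i;

    assert(errors != NULL || n == 0);

    toy_lock(&mtx);
    for (i = 0; i < n; i++) {
        toy_unlock(&mtx);          /* drop the lock around the device call */
        if (errors[i] != 0) {
            toy_lock(&mtx);        /* retake it to put the device back */
            if (extra_unlock)
                toy_unlock(&mtx);  /* the line removed by the patch */
            break;
        }
        toy_lock(&mtx);            /* retake it for the next iteration */
    }
    toy_unlock(&mtx);              /* single unlock on the common exit */
    return mtx.depth;
}
```

With the extra unlock in place, a failing device leaves the function with a depth of -1, that is, the mutex is released twice; without it, every path ends at 0.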
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 21, 2008 1:28 AM, Linus Torvalds [EMAIL PROTECTED] wrote:

> Try suspend-and-resume without X.

Works without those two functions.

> Also, try it on one of the more modern laptops - even *with* X.

Again, still works. Tested on Lenovo X60s.

> Basically, the kernel wants to be able to do what X does, because it means that when it works, it works _so_ much better than doing it in X. So getting it working is definitely worth it. That said, before you do anything else, try if suspend-to-RAM works.

Yes, still works.

> That's the primary goal for this code anyway, and if it works that gives a good hint.

Ok, what's next?

Thanks,
Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 21, 2008 1:17 AM, Jeff Chua [EMAIL PROTECTED] wrote:

> On Feb 20, 2008 2:19 PM, Jeff Chua wrote:
>
> > I'll try the idle=poll to see if that works and will try some printk

Tried idle=poll but it has no effect.

Thanks,
Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 20, 2008 2:19 PM, Jeff Chua wrote:

> I'll try the idle=poll to see if that works and will try some printk

I don't know what exactly i915_suspend() and i915_resume() are supposed to do, because it works better without them.

After inserting "return 0;" right at the top of those two functions, suspend (with a proper power-off) and resume (without the green screen) work just fine. I would like to know what they're for.

Tested suspend-to-ram and suspend-to-disk, both on console and in X on the notebook's internal LCD display; all works without these two functions.

But, anyway, I got it down to just one line in i915_drv.c causing the hang during suspend: pci_set_power_state(dev->pdev, PCI_D3hot);

And the green screen problem during resume is caused by i915_restore_vga(dev);

So, let me know where to go from here.

Thanks,
Jeff.

--- linux/drivers/char/drm/i915_drv.c.bad 2008-02-20 11:29:14 +0800
+++ linux/drivers/char/drm/i915_drv.c 2008-02-21 00:58:37 +0800
@@ -369,7 +369,7 @@
     if (state.event == PM_EVENT_SUSPEND) {
         /* Shut down the device */
         pci_disable_device(dev->pdev);
-        pci_set_power_state(dev->pdev, PCI_D3hot);
+        //pci_set_power_state(dev->pdev, PCI_D3hot);
     }
 
     return 0;
@@ -521,7 +521,7 @@
     for (i = 0; i < 3; i++)
         I915_WRITE(SWF30 + (i << 2), dev_priv->saveSWF2[i]);
 
-    i915_restore_vga(dev);
+    //i915_restore_vga(dev);
 
     return 0;
 }
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, 21 Feb 2008, Jeff Chua wrote:

> After inserting "return 0;" right at the top of those two functions, suspend (and power-off properly), and resume (without green screen) works just fine. I would like to know what they're for.

Try suspend-and-resume without X.

Also, try it on one of the more modern laptops - even *with* X.

Basically, the kernel wants to be able to do what X does, because it means that when it works, it works _so_ much better than doing it in X. So getting it working is definitely worth it.

That said, before you do anything else, try if suspend-to-RAM works. That's the primary goal for this code anyway, and if it works that gives a good hint.

Suspend-to-disk is fundamentally different, and it's entirely possible that for the suspend-to-disk case we should just say "screw trying to suspend/resume graphics", since you'll have the BIOS resuming text-mode anyway, and there are no performance or debugging advantages.

Linus
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, 21 Feb 2008, Jeff Chua wrote:

> Works without those two functions.

Ahh. You're using the BIOS to re-initialize your video, aren't you?

If STR works without X, then you have something else resuming graphics, and that may be what then interacts badly with the fact that the kernel also does so.

> Ok, what's next?

Let's try to narrow it down to what the interaction is. Are you using something like "acpi_sleep=s3_bios" or similar?

That's what the kernel support is supposed to make unnecessary in the long run, along with all the video mode flickering (ie we should be able to resume to the video mode we want, not flicker through unnecessary modes).

Linus
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 21, 2008 1:52 AM, Linus Torvalds [EMAIL PROTECTED] wrote:

> Ahh. You're using the BIOS to re-initialize your video, aren't you?

I don't know. Just pure simple s2ram without any options.

> Let's try to narrow it down to what the interaction is. Are you using something like "acpi_sleep=s3_bios" or similar?

No. No additional command line options except for "resume=/dev/sda3 reboot=bios".

> That's what the kernel support is supposed to make unnecessary in the long run,

Ok, understand now.

Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 21, 2008 1:28 AM, Linus Torvalds [EMAIL PROTECTED] wrote:

> That said, before you do anything else, try if suspend-to-RAM works.

Linus, guess I missed this part ... so before touching anything, I did try suspend-to-ram, and it works on console and in X.

And suspend-to-disk hangs, but I can still press and hold the power button to power it off.

Then upon powering on and resume, I get the ugly green console screen. I can still type and move around. Starting X runs fine. Ctrl-Alt-Del or switching back to console will get back to the green screen.

Thanks,
Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 9:17 am Jeff Chua wrote:

> On Feb 20, 2008 2:19 PM, Jeff Chua wrote:
>
> > I'll try the idle=poll to see if that works and will try some printk
>
> I don't know what exactly the i915_suspend() and i915_resume() are supposed to do because it works better without them.
>
> After inserting "return 0;" right at the top of those two functions, suspend (and power-off properly), and resume (without green screen) works just fine. I would like to know what they're for.

They're for saving and restoring GPU state across suspend/resume. They're particularly useful if your machine doesn't re-POST at resume time. In that case your GPU may be totally uninitialized, so either the kernel or X has to set it up for you (X only does that partially).

> Tested suspend-to-ram, and suspend-to-disk, both console and X on notebook internal LCD display, all works without these two functions.
>
> But, anyway, got down to just one line in i915_drv.c causing the hang during suspend. pci_set_power_state(dev->pdev, PCI_D3hot);.

Interesting, which chipset do you have? AFAIK that shouldn't cause a hang.

> And green screen problem during resume is caused by i915_restore_vga(dev);

I know I fixed that problem in at least one configuration... Can you try:

  # echo test > /sys/power/disk
  # echo disk > /sys/power/state

and see if that also turns your screen green?

Also, getting a GPU register dump would be helpful. The intel_reg_dumper tool is built as part of the xf86-video-intel driver build (git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel), can you pull that down and try it out?

Thanks,
Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Jeff Chua wrote:

> On Feb 20, 2008 2:19 PM, Jeff Chua wrote:
>
> > I'll try the idle=poll to see if that works and will try some printk
>
> I don't know what exactly the i915_suspend() and i915_resume() are supposed to do because it works better without them.
>
> After inserting "return 0;" right at the top of those two functions, suspend (and power-off properly), and resume (without green screen) works just fine.
> ..

Does this machine have more than one CPU core?

If so: does your kernel have CONFIG_HOTPLUG_CPU=y? (If not, enable it.)

??
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Wed, 20 Feb 2008, Matthew Garrett wrote:

> Let's look at this differently. Most hardware is produced by vendors who don't care about Linux. We need to make that hardware work anyway. The only way we can achieve that is to be bug-compatible with Windows. Therefore, any way in which Linux behaviour varies from Windows behaviour is a bug. The only reason to export any indication that the kernel is Linux is because our behaviour is not identical to Windows. But, given that that's a bug, the solution should be to fix Linux and not to encourage vendors to put workarounds in their firmware.

That punishes vendors which actually care about Linux. These are quite rare in the laptop and desktop market, but they do exist. And such vendors are quite *common* in the enterprise hardware market which doesn't run Windows.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Wed, 2008-02-20 at 17:32 +0000, Matthew Garrett wrote:

> Let's look at this differently. Most hardware is produced by vendors who don't care about Linux. We need to make that hardware work anyway.

Not really. If you buy machine noname XY, you have to face the fact that the HW may not work correctly on Linux. You can try to fix it, but you cannot write a driver for a WLAN card from vendor noname and a card reader from never-heard-of-that-company. If you buy the wrong graphics card you may end up without 3D and whatever other cool features the card supports.

So, at least since HP, Dell, Lenovo (also Acer?) are selling laptops with Linux pre-loaded, you should be smart enough to pick such a machine, where the BIOS is adjusted to run on Linux, or you pretty much have to reckon with trouble.

So being Windows compatible is nice, but sticking to specifications is more important (we are far away from, and never will reach, Windows compatibility in the WMI implementation, right?).

Imagine a vendor using if(linux) that provides a whole SSDT with all the fan and thermal implementations fitted perfectly to the ACPI specification, and therefore sticking to the Linux kernel implementations?

The next point is that if vendors pre-load their model with a specific distribution, they need such a knob. Please do not think about "what happens when I upgrade to the latest kernel" (which should still be no problem when they know how to use this). Think about how these vendors should fix a complex Linux bug via a BIOS hot-fix update...

Think about a functional change they have to implement in their BIOS for a Windows Vista SPX change. While the machine may still run fine with the latest mainline kernel, the kernel they have to provide support for will break. I see the problem with this scenario, but try to think from Dell's/HP's/... point of view. They want to have such a thing.

> The only way we can achieve that is to be bug-compatible with Windows. Therefore, any way in which Linux behaviour varies from Windows behaviour is a bug. The only reason to export any indication that the kernel is Linux is because our behaviour is not identical to Windows.

Linux behaviour is not identical to Windows, never will be, and once vendors start pre-loading it also does not need to be...

> But, given that that's a bug, the solution should be to fix Linux and not to encourage vendors to put workarounds in their firmware.

I see it the other way round. Encourage vendors to fix their BIOSes, instead of putting Windows compatibility workarounds into the kernel.

Thomas
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, 21 Feb 2008, Jeff Chua wrote:
> > That said, before you do anything else, try if suspend-to-RAM works.
>
> Linus, guess I missed this part ... so before touch anything, I did
> tried suspend-to-ram, and it works on console and in X.

Ok, so this is with clean current -git, and nothing disabled?

> And suspend-to-disk hangs, but I can still press and hold the power
> button to power it off.

The press and hold for five seconds is actually a hardware feature of the southbridge (well, I guess there is software in there too, but it's the embedded kind). So the fact that it powers off at that point means nothing, it just means that ok, your kernel is hung, but the hardware still works ;)

This *sounds* like some part of the suspend-to-disk sequence is doing something stupid like trying to access the screen after it has been turned off, which doesn't surprise me at all.

My oft-stated opinion has been that suspend-to-disk isn't a suspend at all, and should never have been confused with suspending anything. It's snapshot-and-restore, and my opinion is that:

 - it should *never* call suspend()/resume() at all (that should be reserved purely for suspend-to-RAM and has real power management issues!)

 - it should have a totally separate halt/unhalt/restore thing that has nothing what-so-ever to do with power management, and is purely about stopping the hardware for things like USB and network cards (which otherwise do things like scan their command lists asynchronously) and making sure that the driver state is consistent with that stopped hw state.

 - the people who confuse snapshot/restore with suspend/resume are horrible people that cause problems exactly because driver people then get those things mixed up, and something like the video suspend/resume should probably never have impacted suspend-to-disk in the first place!

HOWEVER, that's a separate fight I've had, and in the meantime:

> Then upon powering on and resume, I get the ugly green console screen.
> I can still type and move around. Starting X runs fine. Ctrl-Alt-Del
> or switching back to console will get back to the green screen.

.. so this implies that while the laptop apparently hung at the end of the snapshotting, the snapshotting did actually work, and it must have hung at the very end, presumably when it tried to actually turn the power off.

So there seems to be two (probably largely independent) problems:

 - the hang at shutdown that requires you to press-and-hold the power button to actually cut power.

   At a guess: putting the VGA device into D3hot makes the ACPI code that actually does the shutoff unhappy. Probably because it wants to access the device, and ends up not ever getting the replies it wants, since the hardware has been turned off.

 - the fact that we restore something wrong for you and the screen is green.

   At a guess: the restore_vga ends up restoring some state that wasn't correctly and fully saved.

IOW, I think your patch that disables the two lines actually ends up pretty much matching the two *different* problems. Can you confirm that doing those two parts of that patch individually actually does individually fix the two issues?

(Ie disabling D3hot makes it shut down nicely but resume with green text, while disabling just restore_vga() ends up with shutdown problems, but once you press-and-hold the power button, the thing will then restore nicely)

Linus
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Wed, Feb 20, 2008 at 07:21:57PM +0100, Thomas Renninger wrote:
> On Wed, 2008-02-20 at 17:32 +0000, Matthew Garrett wrote:
> > Let's look at this differently. Most hardware is produced by vendors
> > who don't care about Linux. We need to make that hardware work anyway.
> Not really. If you buy machine noname XY, you have to face the fact
> that HW may not work on Linux correctly.

No. You have failed. Do not pass go.

> You can try fix to it, but you cannot write a driver for WLAN card
> from vendor noname and card reader from never heard of that company.
> If you buy the wrong graphics card you may end up without 3D and
> whatever else cool features the card supports.

That's fine. Some hardware support is difficult. Some hardware support is not difficult. The only possible situation in which exporting some sort of OSI value for Linux is helpful is when the firmware authors know what the difference between the Linux and Windows behaviours are. If we know that, we can fix it. Fixing it not only fixes the machine in question, it probably fixes a large number of other machines. This is a preferable solution.

> So at least since HP, Dell, Lenovo (also Acer?) are selling pre-loaded
> Linux laptops, you should be smart enough to take such a thing where
> the BIOS is adjusted to run on Linux or you pretty much have to reckon
> with trouble.

Argh. No. If the BIOS is adjusted to run on Linux, it indicates that we've failed. Completely. Utterly.

> So being Windows compatible is nice, but sticking to specifications is
> more important (we are far away from and never will be Windows
> compatibility in WMI implementation right?).

No! What's the point in sticking to specifications when there's only one implementation? Windows is the de-facto specification for ACPI, and we should follow it.

> Next point is that if vendors pre-load their model with a specific
> distribution, they need such a knob.

Fixing Linux is easier than fixing firmware in almost every single case. If vendors are installing Linux without working with the distribution vendor, then that's unfortunate and they're likely to have problems. I'm not going to prioritise them above the huge number of users buying hardware from vendors who aren't as foolish.

> Please do not think about what happens when I upgrade to the latest
> kernel (which should still be no problem when they know how to use
> this). Think about how these vendors should fix a complex Linux bug
> via a BIOS hot-fix update ...

They shouldn't. They should push out a software update.

> > The only way we can achieve that is to be bug-compatible with Windows.
> > Therefore, any way in which Linux behaviour varies from Windows
> > behaviour is a bug. The only reason to export any indication that the
> > kernel is Linux is because our behaviour is not identical to Windows.
> Linux behaviour is not identical to Windows, never will be and after
> vendors start pre-loading also do not need to be...

Wrong.

> > But, given that that's a bug, the solution should be to fix Linux and
> > not to encourage vendors to put workarounds in their firmware.
> I see it the other way round. Encourage vendors to fix their BIOSes,
> instead of putting Windows compatibility workarounds into the kernel.

By which you mean "Cater for a small market, rather than the large one"? No. That would be ridiculous. Our compatibility is sufficiently good that I'm not going to recommend users buy one of the tiny number of laptops available with a supported Linux install over buying a laptop that actually fits their needs.

-- 
Matthew Garrett | [EMAIL PROTECTED]
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 21, 2008 1:50 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > I would like to know what they're for.
>
> They're for saving and restoring GPU state across suspend/resume.
> They're particularly useful if your machine doesn't re-POST at resume
> time. In that case your GPU may be totally uninitialized, so either
> the kernel or X has to set it up for you (X only does that partially).

Ok. A lot to digest.

> Interesting, which chipset do you have? AFAIK that shouldn't cause a hang.

(II) intel(0): Integrated Graphics Chipset: Intel(R) 945GM

> I know I fixed that problem in at least one configuration... Can you try:
>   # echo test > /sys/power/disk
>   # echo disk > /sys/power/state
> and see if that also turns your screen green?

Yes, still green. But I got it to actual reboot with ...

  echo reboot > /sys/power/disk

So, next I'll try shutdown to see if it work. I was using platform.

> Also, getting a GPU register dump would be helpful. The intel_reg_dumper tool

Attached are the two dumps from console. One prior to suspend, and one after resume.

Thanks,
Jeff.

(II): DumpRegsBegin (II):VCLK_DIVISOR_VGA0: 0x00031108 (n = 3, m1 = 17, m2 = 8) (II):VCLK_DIVISOR_VGA1: 0x00031406 (n = 3, m1 = 20, m2 = 6) (II):VCLK_POST_DIV: 0x00020002 (vga0 p1 = 4, p2 = 2, vga1 p1 = 2, p2 = 2) (II):DPLL_TEST: 0x00010001 () (II): CACHE_MODE_0: 0x6820 (II): D_STATE: 0x (II):DSPCLK_GATE_D: 0x1000 (clock gates disabled: DPLUNIT) (II): RENCLK_GATE_D1: 0x (II): RENCLK_GATE_D2: 0x (II):SDVOB: 0x0048 (disabled, pipe A, stall disabled, not detected) (II):SDVOC: 0x0048 (disabled, pipe A, stall disabled, not detected) (II): SDVOUDI: 0x0077 (II): DSPARB: 0x1d9c (II): DSPFW1: 0x (II): DSPFW2: 0x (II): DSPFW3: 0x (II): ADPA: 0x40008c18 (disabled, pipe B, +hsync, +vsync) (II): LVDS: 0xc300 (enabled, pipe B, 18 bit, 1 channel) (II): DVOA: 0x (disabled, pipe A, no stall, -hsync, -vsync) (II): DVOB: 0x0048 (disabled, pipe A, no stall, -hsync, -vsync) (II): DVOC: 0x0048 (disabled, pipe A, no stall, -hsync, -vsync) (II): DVOA_SRCDIM: 0x (II): DVOB_SRCDIM: 0x (II): DVOC_SRCDIM: 0x (II): PP_CONTROL: 0x0001 (power target: on) (II):PP_STATUS: 0xc008 (on, ready, sequencing idle) (II): PFIT_CONTROL: 0x80002668 (II): PFIT_PGM_RATIOS: 0x (II): PORT_HOTPLUG_EN: 0x0020 (II):PORT_HOTPLUG_STAT: 0x (II): DSPACNTR: 0x (disabled, pipe A) (II): DSPASTRIDE: 0x (0 bytes) (II): DSPAPOS: 0x (0, 0) (II): DSPASIZE: 0x (1, 1) (II): DSPABASE: 0x (II): DSPASURF: 0x (II): DSPATILEOFF: 0x (II):PIPEACONF: 0x (disabled, single-wide) (II): PIPEASRC: 0x027f01df (640, 480) (II):PIPEASTAT: 0x8203 (status: FIFO_UNDERRUN VSYNC_INT_STATUS VBLANK_INT_STATUS OREG_UPDATE_STATUS) (II): FBC_CFB_BASE: 0x (II): FBC_LL_BASE: 0x (II): FBC_CONTROL: 0x (II): FBC_COMMAND: 0x (II): FBC_STATUS: 0x2000 (II): FBC_CONTROL2: 0x (II):FBC_FENCE_OFF: 0x (II): FBC_MOD_NUM: 0x (II): FPA0: 0x00031108 (n = 3, m1 = 17, m2 = 8) (II): FPA1: 0x00031108 (n = 3, m1 = 17, m2 = 8) (II): DPLL_A: 0x0483 (disabled, non-dvo, VGA, default clock, DAC/serial mode, p1 = 8, p2 = 10, SDVO mult 1) (II):DPLL_A_MD: 0x (II): HTOTAL_A: 0x031f027f 
(640 active, 800 total) (II): HBLANK_A: 0x03170287 (648 start, 792 end) (II): HSYNC_A: 0x02ef028f (656 start, 752 end) (II): VTOTAL_A: 0x020c01df (480 active, 525 total) (II): VBLANK_A: 0x020401e7 (488 start, 517 end) (II): VSYNC_A: 0x01eb01e9 (490 start, 492 end) (II):BCLRPAT_A: 0x (II): VSYNCSHIFT_A: 0x (II): DSPBCNTR: 0x4900 (disabled, pipe B) (II): DSPBSTRIDE: 0x0280 (640 bytes) (II): DSPBPOS: 0x (0, 0) (II): DSPBSIZE: 0x018f02cf (720, 400) (II): DSPBBASE: 0x (II): DSPBSURF: 0x (II): DSPBTILEOFF: 0x (II):PIPEBCONF: 0x8000 (enabled, single-wide) (II): PIPEBSRC: 0x027f018f (640, 400) (II):PIPEBSTAT: 0x8202 (status: FIFO_UNDERRUN VSYNC_INT_STATUS VBLANK_INT_STATUS) (II): FPB0: 0x00020e09 (n = 2, m1 = 14, m2 = 9) (II): FPB1: 0x00031108 (n = 3, m1 =
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 10:29 am Jeff Chua wrote:
> > I know I fixed that problem in at least one configuration... Can you try:
> >   # echo test > /sys/power/disk
> >   # echo disk > /sys/power/state
> > and see if that also turns your screen green?
>
> Yes, still green. But I got it to actual reboot with ...
>
>   echo reboot > /sys/power/disk
>
> So, next I'll try shutdown to see if it work. I was using platform.

Ok, that would be good to try.

> > Also, getting a GPU register dump would be helpful. The intel_reg_dumper tool
>
> Attached are the two dumps from console. One prior to suspend, and one
> after resume.

Looks like the AR registers are hosed, which is what I thought I fixed... Can you attach your i915_drv.c file just so I can sanity check it?

Thanks,
Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > So, next I'll try shutdown to see if it work. I was using platform.
>
> Ok, that would be good to try.

shutdown does power down properly. But still green on resume.

> Looks like the AR registers are hosed, which is what I thought I fixed... Can
> you attach your i915_drv.c file just so I can sanity check it?

Attached.

Thanks,
Jeff.

/* i915_drv.c -- i830,i845,i855,i865,i915 driver -*- linux-c -*-
 */
/*
 *
 * Copyright 2003 Tungsten Graphics, Inc., Cedar Park, Texas.
 * All Rights Reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the
 * "Software"), to deal in the Software without restriction, including
 * without limitation the rights to use, copy, modify, merge, publish,
 * distribute, sub license, and/or sell copies of the Software, and to
 * permit persons to whom the Software is furnished to do so, subject to
 * the following conditions:
 *
 * The above copyright notice and this permission notice (including the
 * next paragraph) shall be included in all copies or substantial portions
 * of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
 * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
 * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
 * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 */

#include "drmP.h"
#include "drm.h"
#include "i915_drm.h"
#include "i915_drv.h"

#include "drm_pciids.h"

static struct pci_device_id pciidlist[] = {
	i915_PCI_IDS
};

enum pipe {
	PIPE_A = 0,
	PIPE_B,
};

static bool i915_pipe_enabled(struct drm_device *dev, enum pipe pipe)
{
	struct drm_i915_private *dev_priv = dev->dev_private;

	if (pipe == PIPE_A)
		return (I915_READ(DPLL_A) & DPLL_VCO_ENABLE);
	else
		return (I915_READ(DPLL_B) & DPLL_VCO_ENABLE);
}

static void i915_save_palette(struct drm_device *dev, enum pipe pipe)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	unsigned long reg = (pipe == PIPE_A ? PALETTE_A : PALETTE_B);
	u32 *array;
	int i;

	if (!i915_pipe_enabled(dev, pipe))
		return;

	if (pipe == PIPE_A)
		array = dev_priv->save_palette_a;
	else
		array = dev_priv->save_palette_b;

	for(i = 0; i < 256; i++)
		array[i] = I915_READ(reg + (i << 2));
}

static void i915_restore_palette(struct drm_device *dev, enum pipe pipe)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	unsigned long reg = (pipe == PIPE_A ? PALETTE_A : PALETTE_B);
	u32 *array;
	int i;

	if (!i915_pipe_enabled(dev, pipe))
		return;

	if (pipe == PIPE_A)
		array = dev_priv->save_palette_a;
	else
		array = dev_priv->save_palette_b;

	for(i = 0; i < 256; i++)
		I915_WRITE(reg + (i << 2), array[i]);
}

static u8 i915_read_indexed(u16 index_port, u16 data_port, u8 reg)
{
	outb(reg, index_port);
	return inb(data_port);
}

static u8 i915_read_ar(u16 st01, u8 reg, u16 palette_enable)
{
	inb(st01);
	outb(palette_enable | reg, VGA_AR_INDEX);
	return inb(VGA_AR_DATA_READ);
}

static void i915_write_ar(u8 st01, u8 reg, u8 val, u16 palette_enable)
{
	inb(st01);
	outb(palette_enable | reg, VGA_AR_INDEX);
	outb(val, VGA_AR_DATA_WRITE);
}

static void i915_write_indexed(u16 index_port, u16 data_port, u8 reg, u8 val)
{
	outb(reg, index_port);
	outb(val, data_port);
}

static void i915_save_vga(struct drm_device *dev)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	int i;
	u16 cr_index, cr_data, st01;

	/* VGA color palette registers */
	dev_priv->saveDACMASK = inb(VGA_DACMASK);
	/* DACCRX automatically increments during read */
	outb(0, VGA_DACRX);
	/* Read 3 bytes of color data from each index */
	for (i = 0; i < 256 * 3; i++)
		dev_priv->saveDACDATA[i] = inb(VGA_DACDATA);

	/* MSR bits */
	dev_priv->saveMSR = inb(VGA_MSR_READ);
	if (dev_priv->saveMSR & VGA_MSR_CGA_MODE) {
		cr_index = VGA_CR_INDEX_CGA;
		cr_data = VGA_CR_DATA_CGA;
		st01 = VGA_ST01_CGA;
	} else {
		cr_index = VGA_CR_INDEX_MDA;
		cr_data = VGA_CR_DATA_MDA;
		st01 = VGA_ST01_MDA;
	}

	/* CRT controller regs */
	i915_write_indexed(cr_index, cr_data, 0x11,
			   i915_read_indexed(cr_index, cr_data, 0x11) &
			   (~0x80));
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Wed, Feb 20, 2008 at 03:23:40PM -0300, Henrique de Moraes Holschuh wrote:
> On Wed, 20 Feb 2008, Matthew Garrett wrote:
> > Let's look at this differently. Most hardware is produced by vendors
> > who don't care about Linux. We need to make that hardware work anyway.
> > The only way we can achieve that is to be bug-compatible with Windows.
> > Therefore, any way in which Linux behaviour varies from Windows
> > behaviour is a bug. The only reason to export any indication that the
> > kernel is Linux is because our behaviour is not identical to Windows.
> > But, given that that's a bug, the solution should be to fix Linux and
> > not to encourage vendors to put workarounds in their firmware.
>
> That punishes vendors which actually care about Linux. These are quite
> rare in the laptop and desktop market, but they do exist.

It doesn't punish them. They're the ones who are going to work with us to ensure that Linux works on their hardware, and their needs are going to be prioritised over those of vendors who don't care about Linux. The other choice (encourage vendors to put workarounds in their firmware instead) *does* punish users who've ended up with laptops that aren't actively supported - like, say, pretty much anything on the market.

-- 
Matthew Garrett | [EMAIL PROTECTED]
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote:
> On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > > So, next I'll try shutdown to see if it work. I was using platform.
> >
> > Ok, that would be good to try.
>
> shutdown does power down properly. But still green on resume.

Ok, so Linus' theory about something later in the resume path trying to touch video is looking good. Rafael, is there anyway to prevent the device shutdown in the hibernate path?

> > Looks like the AR registers are hosed, which is what I thought I fixed...
> > Can you attach your i915_drv.c file just so I can sanity check it?
>
> Attached.

Hm, looks right. Let me see if I can reproduce this on my T61.

Thanks,
Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 11:18 am Jesse Barnes wrote:
> On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote:
> > On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > > > So, next I'll try shutdown to see if it work. I was using platform.
> > >
> > > Ok, that would be good to try.
> >
> > shutdown does power down properly. But still green on resume.
>
> Ok, so Linus' theory about something later in the resume path trying to
> touch video is looking good. Rafael, is there anyway to prevent the
> device shutdown in the hibernate path?

Given the way the PM core works, do we need to set a flag like this? I really hope there's a better way of doing this...

Thanks,
Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 4048f39..a2d6242 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -238,6 +238,13 @@ static void i915_restore_vga(struct drm_device *dev)
 }
 
+/*
+ * If we're doing a suspend to disk, we don't want to power off the device.
+ * Unfortunately, the PM core doesn't tell us if we're headed for a regular
+ * S3 state or that it's about to shut down the machine, so we use this flag.
+ */
+static int i915_hibernate;
+
 static int i915_suspend(struct drm_device *dev, pm_message_t state)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -252,6 +259,9 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 	if (state.event == PM_EVENT_PRETHAW)
 		return 0;
 
+	if (state.event == PM_EVENT_FREEZE)
+		i915_hibernate = 1;
+
 	pci_save_state(dev->pdev);
 	pci_read_config_byte(dev->pdev, LBB, &dev_priv->saveLBB);
 
@@ -366,7 +376,7 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
+	if (!i915_hibernate) {
 		/* Shut down the device */
 		pci_disable_device(dev->pdev);
 		pci_set_power_state(dev->pdev, PCI_D3hot);
@@ -385,6 +395,8 @@ static int i915_resume(struct drm_device *dev)
 	if (pci_enable_device(dev->pdev))
 		return -1;
 
+	i915_hibernate = 0;
+
 	pci_write_config_byte(dev->pdev, LBB, dev_priv->saveLBB);
 
 	/* Pipe  plane A info */
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, 20 of February 2008, Jesse Barnes wrote:
> On Wednesday, February 20, 2008 11:18 am Jesse Barnes wrote:
> > On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote:
> > > On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > > > > So, next I'll try shutdown to see if it work. I was using platform.
> > > >
> > > > Ok, that would be good to try.
> > >
> > > shutdown does power down properly. But still green on resume.
> >
> > Ok, so Linus' theory about something later in the resume path trying to
> > touch video is looking good. Rafael, is there anyway to prevent the
> > device shutdown in the hibernate path?
>
> Given the way the PM core works, do we need to set a flag like this? I
> really hope there's a better way of doing this...

I think we should export the target sleep state somehow.

> diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
> index 4048f39..a2d6242 100644
> --- a/drivers/char/drm/i915_drv.c
> +++ b/drivers/char/drm/i915_drv.c
> @@ -238,6 +238,13 @@ static void i915_restore_vga(struct drm_device *dev)
>  }
>
> +/*
> + * If we're doing a suspend to disk, we don't want to power off the device.
> + * Unfortunately, the PM core doesn't tell us if we're headed for a regular
> + * S3 state or that it's about to shut down the machine, so we use this flag.
> + */
> +static int i915_hibernate;
> +
>  static int i915_suspend(struct drm_device *dev, pm_message_t state)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -252,6 +259,9 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
>  	if (state.event == PM_EVENT_PRETHAW)
>  		return 0;
>
> +	if (state.event == PM_EVENT_FREEZE)
> +		i915_hibernate = 1;
> +
>  	pci_save_state(dev->pdev);
>  	pci_read_config_byte(dev->pdev, LBB, &dev_priv->saveLBB);
>
> @@ -366,7 +376,7 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
>
>  	i915_save_vga(dev);
>
> -	if (state.event == PM_EVENT_SUSPEND) {
> +	if (!i915_hibernate) {
>  		/* Shut down the device */
>  		pci_disable_device(dev->pdev);
>  		pci_set_power_state(dev->pdev, PCI_D3hot);
> @@ -385,6 +395,8 @@ static int i915_resume(struct drm_device *dev)
>  	if (pci_enable_device(dev->pdev))
>  		return -1;
>
> +	i915_hibernate = 0;
> +
>  	pci_write_config_byte(dev->pdev, LBB, dev_priv->saveLBB);
>
>  	/* Pipe  plane A info */

Then, the .resume() called after the image creation will clear the flag and I don't think it's safe to allow it to survive i915_resume() ...

Thanks,
Rafael
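Rafael's objection can be traced with a small stand-alone model. This is only a sketch, not driver code: `fake_gpu`, `fake_suspend()` and `fake_resume()` are hypothetical stand-ins for the patched i915_suspend()/i915_resume(), and the hibernate ordering used here (a FREEZE suspend, a resume after image creation, then a second SUSPEND suspend before power-off) is an assumption taken from how the callback sequence is described later in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PM events named in the thread. */
enum fake_pm_event { FAKE_EVENT_FREEZE, FAKE_EVENT_SUSPEND };

struct fake_gpu {
	bool hibernate_flag;	/* models the patch's static i915_hibernate */
	bool powered_down;	/* true once the device would enter D3hot */
};

/* Models the patched suspend path: power down unless the flag is set. */
static void fake_suspend(struct fake_gpu *gpu, enum fake_pm_event event)
{
	if (event == FAKE_EVENT_FREEZE)
		gpu->hibernate_flag = true;
	if (!gpu->hibernate_flag)
		gpu->powered_down = true;
}

/* Models the patched resume path: re-enable the device, clear the flag. */
static void fake_resume(struct fake_gpu *gpu)
{
	gpu->powered_down = false;
	gpu->hibernate_flag = false;
}

/* Assumed hibernate ordering; reports whether the device ends up off. */
bool hibernate_powers_down_gpu(void)
{
	struct fake_gpu gpu = { false, false };

	fake_suspend(&gpu, FAKE_EVENT_FREEZE);	/* before image creation */
	fake_resume(&gpu);			/* after image creation: flag cleared */
	fake_suspend(&gpu, FAKE_EVENT_SUSPEND);	/* just before power-off */
	return gpu.powered_down;
}

/* Plain suspend-to-RAM: a single SUSPEND event. */
bool s3_powers_down_gpu(void)
{
	struct fake_gpu gpu = { false, false };

	fake_suspend(&gpu, FAKE_EVENT_SUSPEND);
	return gpu.powered_down;
}
```

In this model both functions report that the device is powered down: the intermediate resume clears the flag, so the final suspend before power-off cannot tell hibernation apart from S3, which appears to be the problem Rafael is pointing at.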
module bay on dell d830
Hi, I have a Dell D830 (has module bay). I've tried searching the web for the answer to should I, with 2.6.24.2 (and latest Dell BIOS) be able to remove the cdrom bay drive and continue to suspend/resume. I kind of don't think I should be able to yet, but I thought I'd ask here. So: How do I (or can I) remove the cdrom from the acpi configuration (for lack if a better term) so that I can replace the cdrom drive in the module bay with a battery and continue to sleep/resume? Currently, I can 1) remove the ide_cd and cdrom modules 2) echo 1 |/sys/devices/platform/bay.0/eject # I do not see that this does any thing. 3) physically remove the cdrom But, if I try to sleep, the sleep process seems to hang. If after waiting for several 10's of seconds, I physically re-insert the cdrom module, the system will finish the sleep process and go to sleep. Note: /sys/devices/platform/bay.0 information seems to mirror the information in /sys/devices/platform/bay.1. For example, before and after the cdrom is physically removed, both /sys/.../bay.0/present and /sys/.../bay.1/present show the same information. Does anyone know how or if I can do this? What info do I need to provide? (attached is ACPI message from /var/log/dmesg) output of acpidump is at fnapcf.fnal.gov/~ron/acpidump-d830-2.6.24.2.txt The following processes are running. # ps aux | grep '[a]cpi' root68 0.0 0.0 0 0 ?S Feb16 0:00 [kacpid] root69 0.0 0.0 0 0 ?S Feb16 0:00 [kacpi_notify] root 3730 0.0 0.0 3768 628 ?Ss Feb16 0:00 /usr/sbin/acpid 684028 0.0 0.0 12272 852 ?SFeb16 0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket I have the following modules (among others, of course): # lsmod | egrep 'cdrom|ide_|bay' ide_cd 44832 0 cdrom 39592 1 ide_cd bay11392 0 dock 16416 1 bay Could I write a script that echo's something to some file in /sys/devices/platform/bay.1/eject and unloads modules? Then could/should I adjust the files under /etc/acpi/ to do some sort of reconfig/rescan?? 
after I get the ACPI: \_SB_.PCI0.IDE1.PRI_.MAST: Bay event which appears in dmesg, but not in /var/log/acpid (which is most likely because I do not have anything configured in /etc/acpi/events. I have a dell latitude d830 with BIOS A08 (the latest). I'm running linux-2.6.24.2 x86_64. My distribution is Scientific Linux 5 (based on RHEL5). I have the latest nvidia driver and can suspend to ram and resume. BTW, thanks to all who have made this possible! I really feel that my productivity at work is maximized when all the information I keep on my 24 virtual desktops can be maintained across many many suspend/resume cycles. I did buy an 80 Gig. module bay hard disk tha I would eventually like to get working under linux also. I apologize if I missed the answer to this before. I really have tried to search the web for the answer and have failed. Thanks, Ron ACPI: RSDP 000FBB00, 0024 (r2 DELL ) ACPI: XSDT DFE5D200, 0064 (r1 DELLM08 27D8010E ASL61) ACPI: FACP DFE5D09C, 00F4 (r4 DELLM08 27D8010E ASL61) ACPI: DSDT DFE5D800, 63F7 (r2 INT430 SYSFexxx 1001 INTL 20050624) ACPI: FACS DFE6C000, 0040 ACPI: HPET DFE5D300, 0038 (r1 DELLM081 ASL61) ACPI: APIC DFE5D400, 0068 (r1 DELLM08 27D8010E ASL47) ACPI: ASF! DFE5D000, 007E (r32 DELLM08 27D8010E ASL61) ACPI: MCFG DFE5D3C0, 003E (r16 DELLM08 27D8010E ASL61) ACPI: SLIC DFE5D49C, 0176 (r1 DELLM08 27D8010E ASL61) ACPI: TCPA DFE5D700, 0032 (r10 ASL 0) ACPI: SSDT DFE5B97E, 04CC (r1 PmRefCpuPm 3000 INTL 20050624) ACPI: DMI detected: Dell ACPI: PM-Timer IO Port: 0x1008 ACPI: Local APIC address 0xfee0 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0]) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. 
ACPI: IRQ9 used by override. ACPI: HPET id: 0x8086a201 base: 0xfed0 Using ACPI (MADT) for SMP configuration information ACPI: Core revision 20070126 ACPI: bus type pci registered ACPI: EC: Look up EC in DSDT ACPI: BIOS _OSI(Linux) query ignored via DMI ACPI: If acpi_osi=Linux works better, please notify linux-acpi@vger.kernel.org ACPI: SSDT DFE6C080, 0043 (r1 LMPWR DELLLOM 1001 INTL 20050624) ACPI: Interpreter enabled ACPI: (supports S0 S3 S4 S5) ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (:00) PCI quirk: region 1000-107f claimed by ICH6 ACPI/GPIO/TCO ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wed, 20 Feb 2008, Rafael J. Wysocki wrote:
> I think we should export the target sleep state somehow.

Yeah. By *not* using ->suspend() for freezing or hibernate.

Please, Rafael - just make the f*cking suspend-to-disk use other routines already. 99% of all hardware needs to do exactly *nothing* on suspend-to-disk, and the ones that really do need things tend to need to not do a whole lot.

For example, the freeze action for USB (which is one of the hardest things to suspend) should literally be something like just setting the controller STOP bit, and waiting for it to have stopped. The unfreeze should be to just clear the stop bit, while the restart should be just a controller reset to use the current memory image.

NONE OF THIS HAS ABSOLUTELY ANYTHING TO DO WITH SUSPEND. It never did. I've told people so for years. Maybe actually seeing the problems will make people realize.

So please, we shouldn't call ->suspend[_late] or ->resume[_early] at all. Not with PMSG_FREEZE, not with PMSG_*anything*.

Can we please get this fixed some day?

Linus
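The freeze/unfreeze/restart split Linus describes for a USB host controller can be sketched against a simulated register block. Everything here is hypothetical: `fake_hc`, the `FAKE_CMD_STOP` and `FAKE_STS_STOPPED` bits, and the polling helper are invented for illustration and do not correspond to any real controller or kernel API.

```c
#include <stdint.h>

#define FAKE_CMD_STOP		0x1u	/* hypothetical "stop" command bit */
#define FAKE_STS_STOPPED	0x1u	/* hypothetical "has stopped" status bit */

struct fake_hc {
	uint32_t cmd;
	uint32_t status;
	int polls_until_stopped;	/* simulates hardware latency */
};

/* Simulated status read: the controller stops a few polls after STOP is set. */
static uint32_t fake_hc_read_status(struct fake_hc *hc)
{
	if (hc->cmd & FAKE_CMD_STOP) {
		if (hc->polls_until_stopped > 0)
			hc->polls_until_stopped--;
		else
			hc->status |= FAKE_STS_STOPPED;
	}
	return hc->status;
}

/* Freeze: set the STOP bit and wait for the controller to report stopped. */
int fake_hc_freeze(struct fake_hc *hc, int max_polls)
{
	int i;

	hc->cmd |= FAKE_CMD_STOP;
	for (i = 0; i < max_polls; i++)
		if (fake_hc_read_status(hc) & FAKE_STS_STOPPED)
			return 0;
	return -1;	/* timed out */
}

/* Unfreeze: clear the STOP bit; no other state is touched. */
void fake_hc_unfreeze(struct fake_hc *hc)
{
	hc->cmd &= ~FAKE_CMD_STOP;
	hc->status &= ~FAKE_STS_STOPPED;
}

/* Restart after restore: reset the controller so it picks up current memory. */
void fake_hc_restart(struct fake_hc *hc)
{
	hc->cmd = 0;
	hc->status = 0;
}
```

None of the three operations changes a power state, which is the point being argued: under these assumptions, quiescing a device for a snapshot needs no suspend/resume machinery at all.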
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 12:29 pm Linus Torvalds wrote:
> On Wed, 20 Feb 2008, Rafael J. Wysocki wrote:
> > I think we should export the target sleep state somehow.
>
> Yeah. By *not* using ->suspend() for freezing or hibernate.
>
> Please, Rafael - just make the f*cking suspend-to-disk use other routines
> already. 99% of all hardware needs to do exactly *nothing* on
> suspend-to-disk, and the ones that really do need things tend to need to
> not do a whole lot.

In talking with Rafael on IRC about this, I think we're agreed that we need separate entry points. Even with a kexec based hibernate, we'll probably want ->hibernate callbacks so we don't end up shutting down the device.

The current callback system looks like this (according to Rafael and the last time I looked):

  ->suspend(PMSG_FREEZE)
  ->resume()
  ->suspend(PMSG_SUSPEND)
  *enter S3 or power off*
  ->resume()

The fact that we get suspend/resume called once before suspend again in the hibernate case is somewhat obnoxious, but it's even worse that we don't know what we're about to enter after ->suspend(PMSG_SUSPEND).

So in the short term it would be nice to at least get the target state exported.

And in the long term we could have:

  ->suspend()
  *enter S3*
  ->resume()

or:

  ->hibernate()
  *kexec to another kernel to save image*
  *power off*
  ->return_from_hibernate() (or somesuch)

Jesse
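The long-term split Jesse outlines could look roughly like the following toy model, where the core picks the callback pair from the target state instead of passing a message to a shared suspend hook. The names `fake_pm_ops`, `fake_dev` and `fake_sleep_cycle()` are made up for this sketch and are not a proposed kernel interface; the trace string only records the order of calls.

```c
#include <stdio.h>
#include <string.h>

enum sleep_target { TARGET_S3, TARGET_HIBERNATE };

struct fake_dev {
	char trace[128];	/* records callback order for inspection */
	int powered_down;
};

/* Separate entry points per target, as suggested in the message above. */
struct fake_pm_ops {
	void (*suspend)(struct fake_dev *dev);
	void (*resume)(struct fake_dev *dev);
	void (*hibernate)(struct fake_dev *dev);
	void (*return_from_hibernate)(struct fake_dev *dev);
};

static void note(struct fake_dev *dev, const char *what)
{
	size_t used = strlen(dev->trace);

	snprintf(dev->trace + used, sizeof(dev->trace) - used, "%s;", what);
}

static void drv_suspend(struct fake_dev *dev)
{
	dev->powered_down = 1;	/* S3: the driver may power the device down */
	note(dev, "suspend");
}

static void drv_resume(struct fake_dev *dev)
{
	dev->powered_down = 0;
	note(dev, "resume");
}

static void drv_hibernate(struct fake_dev *dev)
{
	note(dev, "hibernate");	/* quiesce only; the device stays powered */
}

static void drv_return_from_hibernate(struct fake_dev *dev)
{
	note(dev, "return_from_hibernate");
}

const struct fake_pm_ops fake_driver_ops = {
	drv_suspend, drv_resume, drv_hibernate, drv_return_from_hibernate,
};

/* The core knows the target, so the driver never has to guess it. */
void fake_sleep_cycle(const struct fake_pm_ops *ops, struct fake_dev *dev,
		      enum sleep_target target)
{
	if (target == TARGET_S3) {
		ops->suspend(dev);
		note(dev, "enter-S3");
		ops->resume(dev);
	} else {
		ops->hibernate(dev);
		note(dev, "save-image");
		note(dev, "power-off");
		ops->return_from_hibernate(dev);
	}
}
```

With this shape the hibernate path never reaches the code that powers the device down, so a driver needs no flag to distinguish the two cases.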
Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday 20 February 2008 at 3:29 pm, Linus Torvalds penned about Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green. Can we please get this fixed some day? I can't say I even come close to understand what's going on but getting s2ram to work on my Dell M4300 has been a nightmare. Even after writing up how to get it to work (posted on the suspend-devel list - but no one answered .. yet again), I'm having some quirks. If I had a bizillion $'s, I'd buy an M4300 for Linus and give him a million to get it to s2ram! :p Cheers, -- Pablo Sanchez - Blueoak Database Engineering, Inc Ph:819.459.1926 Toll free: 888.459.1926 Fax: 603.720.7723 (US) Text Page: [EMAIL PROTECTED]
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wed, 20 Feb 2008, Jesse Barnes wrote:
> The current callback system looks like this (according to Rafael and the
> last time I looked):
>   ->suspend(PMSG_FREEZE)
>   ->resume()
>   ->suspend(PMSG_SUSPEND)
>   *enter S3 or power off*
>   ->resume()

Yes, it's very messy.

It's messy for a few different reasons:

 - the one you hit: a driver actually has a really hard time telling what PMSG_SUSPEND really means.

 - more importantly, we generally don't want to suspend/resume the hardware at all around a power-off, because we're going to resume with the state at the time of the PMSG_FREEZE, which means that the hardware has actually *changed* and been used in between!

that second case is very fundamental for things like USB devices, which in theory you can hold alive over a real suspend event (ie a STR event), but which absolutely MUST NOT be resumed over a suspend-to-disk event, because all the low-level request state is bogus!

So the ->resume really isn't a resume at all. It's much closer to a ->reset.

Of course, the solution to this all right now is that we have to reset everything even if it *is* a suspend event, so it basically means that STR ends up using the much weaker model that snapshot-to-disk uses.

The fundamental problem being that the two really have nothing what-so-ever to do with each other. They aren't even similar. Never were.

> And in the long term we could have:
>   ->suspend()
>   *enter S3*
>   ->resume()

Yes, apart from all the complexities (suspend_late/resume_early). So in reality it's more than that, but the suspend/resume things are clearly nesting, and they have the potential to actually keep state around (because we *know* this machine is not going to mess with the devices in between).

IOW, here we actually can have as an option assume the device is there when you return.

> or:
>   ->hibernate()
>   *kexec to another kernel to save image*
>   *power off*
>   ->return_from_hibernate()
> (or somesuch)

Enough people don't trust kexec that I suspect the right thing simply is

  ->freeze() // stop dma, synchronize device state
  *snapshot*
  ->unfreeze(); // resume dma
  *save image*
  [ optionally ->poweroff() ] // do we really care? I'd say no
  *power off*
  ->restore() // reset device to the frozen one

which may have four entry-points that can be illogically mapped to the suspend/resume ones like we do now, but they really have nothing to do with suspending/resuming.

And notice how while freeze/restore kind of pairs like a suspend/resume, it really shouldn't be expected to realistically restore the same state at all. The restore part is generally much better seen as a reset hardware than a resume thing. Because we literally cannot trust *anything* about the state since we froze it - we might have booted a different OS in between etc.

Very different from suspend/resume.

Linus
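[Archive note] The freeze / snapshot / unfreeze / power off / restore sequence in the mail above can be sketched as a small user-space model. This is only an illustration under stated assumptions: struct toy_dev and the toy_* functions are hypothetical names made up for this note, not kernel interfaces. The point it tries to show is that restore reprograms the device from the frozen copy and never trusts what the hardware currently holds.

```c
#include <stdbool.h>

/* Hypothetical device model: one hardware register plus the driver's copy. */
struct toy_dev {
	unsigned hw_reg;     /* what the hardware currently holds */
	unsigned saved_reg;  /* what the driver recorded at freeze time */
	bool dma_running;
};

/* freeze: stop DMA and record the state that ends up in the image. */
static void toy_freeze(struct toy_dev *d)
{
	d->dma_running = false;
	d->saved_reg = d->hw_reg;
}

/* unfreeze: let the device run again so the image can be written out. */
static void toy_unfreeze(struct toy_dev *d)
{
	d->dma_running = true;
}

/*
 * restore: anything may have used the hardware since the freeze, so its
 * current contents are ignored and it is reprogrammed from the saved copy.
 */
static void toy_restore(struct toy_dev *d)
{
	d->hw_reg = d->saved_reg;
	d->dma_running = true;
}

/* Walk the whole sequence; returns the register value after restore. */
static unsigned toy_hibernate_cycle(struct toy_dev *d, unsigned garbage)
{
	toy_freeze(d);
	/* *snapshot* would be taken here */
	toy_unfreeze(d);
	d->hw_reg += 1;          /* device keeps being used while the image is saved */
	/* *power off*; something else touches the hardware before we come back */
	d->hw_reg = garbage;
	d->dma_running = false;
	toy_restore(d);
	return d->hw_reg;
}
```

Calling toy_hibernate_cycle() on a device whose register held 0x10 at freeze time gives back 0x10 no matter what value is passed as garbage, which is the "reset to the frozen state" behaviour described in the mail.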
Machine shutdown after resume from S3
I have an Intel Classmate hardware sample here, and I have a weird problem with suspend to RAM: the machine powers off when resuming. I isolated the problem to the button ACPI module; without loading it (or after removing it before doing an s2ram) I don't get the problem. On this specific machine I can only resume by pressing the power button, so I think this is related. I started looking into the kernel code and did some tests. The first thing I tried, just as a test, was to disable the code in the acpi_button_notify function. As expected, it stopped sending the power button key events to /proc/acpi/event, but I still got the same s2ram issue. But when I disabled the acpi_install_fixed_event_handler calls in acpi_button_install_notify_handlers, the power-off issue in s2ram was gone (of course the power button no longer notified anything either :), but this was just a test). After these tests I went further to try to track down the problem and looked at acpi_ev_fixed_event_dispatch, which is the function that calls acpi_button_notify_fixed. First thing I noted: the comment on acpi_ev_fixed_event_dispatch says it will return INTERRUPT_HANDLED or INTERRUPT_NOT_HANDLED, but acpi_button_notify_fixed returns AE_OK. Is this right (is the comment outdated), or am I missing something? Anyway, I changed AE_OK to ACPI_INTERRUPT_HANDLED, but this didn't change anything. In the end I stopped there; there doesn't seem to be anything wrong with the code at all. I also took a look at acpi_ev_fixed_event_detect and other code related to the table of fixed events (acpi_gbl_fixed_event_handlers), but didn't get any more clues. Could this be a BIOS issue, or are there any hints as to what I can look at to prove whether it's BIOS or code related? PS: with netconsole I don't get any message before the power off after resume; I tried it to get more hints.
-- []'s Herton
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote: On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote: So, next I'll try shutdown to see if it works. I was using platform. Ok, that would be good to try. shutdown does power down properly. But still green on resume. Looks like the AR registers are hosed, which is what I thought I fixed... Can you attach your i915_drv.c file just so I can sanity check it? Attached. Ok, can you give this patch a try with the 'platform' method? It should at least tell us what ACPI would like the device to do at suspend time, but it probably won't fix the hang. Thanks, Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 4048f39..d8aa2c9 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -366,11 +366,11 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
-		/* Shut down the device */
-		pci_disable_device(dev->pdev);
-		pci_set_power_state(dev->pdev, PCI_D3hot);
-	}
+	/* Ask ACPI which state the device should be put in */
+	pci_disable_device(dev->pdev);
+	printk("calling pci_set_power_state with %d\n",
+	       acpi_pci_choose_state(dev, state));
+	pci_set_power_state(dev->pdev, acpi_pci_choose_state(dev, state));
 
 	return 0;
 }
@@ -380,7 +380,7 @@ static int i915_resume(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int i;
 
-	pci_set_power_state(dev->pdev, PCI_D0);
+	pci_set_power_state(dev->pdev, acpi_pci_choose_state(dev, state));
 	pci_restore_state(dev->pdev);
 	if (pci_enable_device(dev->pdev))
 		return -1;
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 1:13 pm Linus Torvalds wrote: On Wed, 20 Feb 2008, Jesse Barnes wrote: The current callback system looks like this (according to Rafael and the last time I looked): -suspend(PMSG_FREEZE) -resume() -suspend(PMSG_SUSPEND) *enter S3 or power off* -resume() Yes, it's very messy. It's messy for a few different reasons: - the one you hit: a driver actually has a really hard time telling what PMSG_SUSPEND really means. - more importantly, we generally don't want to suspend/resume the hardware at all around a power-off, because we're going to resume with the state at the time of the PMSG_FREEZE, which means that the hardware has actually *changed* and been used in between! Exactly. So the -resume really isn't a resume at all. It's much closer to a -reset. Yeah, in the hibernate case this is definitely true. Of course, the solution to this all right now is that we have to reset everything even if it *is* a suspend event, so it basically means that STR ends up using the much weaker model that snapshot-to-disk uses. The fundamental problem being that the two really have nothing what-so-ever to do with each other. They aren't even similar. Never were. And in the long term we could have: -suspend() *enter S3* -resume() Yes, apart from all the complexities (suspend_late/resume_early). So in reality it's more than that, but the suspend/resume things are clearly nesting, and they have the potential to actually keep state around (because we *know* this machine is not going to mess with the devices in between). Really, in the simple s3 case we still need early/late stuff? IOW, here we actually can have as an option assume the device is there when you return. 
or: -hibernate() *kexec to another kernel to save image* *power off* -return_from_hibernate() (or somesuch) Enough people don't trust kexec that I suspect the right thing simply is -freeze() // stop dma, synchronize device state *snapshot* -unfreeze(); // resume dma *save image* [ optionally -poweroff() ] // do we really care? I'd say no *power off* -restore() // reset device to the frozen one which may have four entry-points that can be illogically mapped to the suspend/resume ones like we do now, but they really have nothing to do with suspending/resuming. Well, it seems like we'll have to fix drivers in either case, and isn't a kexec approach fundamentally more sound and simple, design-wise? Rafael pointed out some problems with properly setting wakeup states, but I think that could be overcome... And notice how while freeze/restore kind of pairs like a suspend/resume, it really shouldn't be expected to realistically restore the same state at all. The restore part is generally much better seen as a reset hardware than a resume thing. Because we literally cannot trust *anything* about the state since we froze it - we might have booted a different OS in between etc. Very different from suspend/resume. Yeah, definitely. It has to be much more robust and deal with configuration changes, etc. (within reason). Jesse - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Rafael J. Wysocki wrote: On Wednesday, 20 of February 2008, Linus Torvalds wrote: On Wed, 20 Feb 2008, Rafael J. Wysocki wrote: I think we should export the target sleep state somehow. Yeah. By *not* using -suspend() for freezing or hibernate. Please, Rafael - just make the f*cking suspend-to-disk use other routines already. Okay, I think I'll just start sending patches for that, but rather not earlier than in the 2.6.27 time frame. No one else works on that and I've been busy with other things recently. Besides, I'm not even a full time kernel developer ... Rafael, If I can help, please say so. Regards, Alex. 99% of all hardware needs to do exactly *nothing* on suspend-to-disk, and the ones that really do need things tend to need to not do a whole lot. For example, the freeze action for USB (which is one of the hardest things to suspend) should literally be something like just setting the controller STOP bit, and waiting for it to have stopped. The unfreeze should be to just clear the stop bit, while the restart should be just a controller reset to use the current memory image. NONE OF THIS HAS ABSOLUTELY ANYTHING TO DO WITH SUSPEND. It never did. I've told people so for years. Maybe actually seeing the problems will make people realize. I think so. So please, we shouldn't call -suspend[_late] or -resume[_early] at all. Not with PMSG_FREEZE, not with PMSG_*anything*. Can we please get this fixed some day? Yes, we can (hopefully). Thanks, Rafael
patch pm-remove-unbalanced-mutex_unlock-from-dpm_resume.patch added to gregkh-2.6 tree
This is a note to let you know that I've just added the patch titled

     Subject: PM: Remove unbalanced mutex_unlock() from dpm_resume()

to my gregkh-2.6 tree. Its filename is

     pm-remove-unbalanced-mutex_unlock-from-dpm_resume.patch

This tree can be found at
    http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/

From [EMAIL PROTECTED] Wed Feb 20 12:56:19 2008
From: Rafael J. Wysocki [EMAIL PROTECTED]
Date: Wed, 20 Feb 2008 02:01:41 +0100
Subject: PM: Remove unbalanced mutex_unlock() from dpm_resume()
To: Greg KH [EMAIL PROTECTED]
Cc: ACPI Devel Maling List linux-acpi@vger.kernel.org, Alan Stern [EMAIL PROTECTED], Len Brown [EMAIL PROTECTED], Linux-pm mailing list [EMAIL PROTECTED], LKML [EMAIL PROTECTED], Pavel Machek [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Content-Disposition: inline

From: Rafael J. Wysocki [EMAIL PROTECTED]

Remove an unnecessary unlocking of dpm_list_mtx in the error path in
drivers/base/power/main.c:dpm_suspend() .

Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
Acked-by: Alan Stern [EMAIL PROTECTED]
Signed-off-by: Greg Kroah-Hartman [EMAIL PROTECTED]

---
 drivers/base/power/main.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -479,7 +479,6 @@ static int dpm_suspend(pm_message_t stat
 			mutex_lock(&dpm_list_mtx);
 			if (list_empty(&dev->power.entry))
 				list_add(&dev->power.entry, &dpm_locked);
-			mutex_unlock(&dpm_list_mtx);
 			break;
 		}
 		mutex_lock(&dpm_list_mtx);

Patches currently in gregkh-2.6 which might be from [EMAIL PROTECTED] are

driver/driver-core-pm-make-suspend_device-static.patch
driver/pm-remove-unbalanced-mutex_unlock-from-dpm_resume.patch
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wed, 20 Feb 2008, Jesse Barnes wrote: Really, in the simple s3 case we still need early/late stuff? Absolutely. Two big reasons: - debuggability I know we don't do this correctly right now, but I want to be able to at least feel like we can some day actually do printk's etc through 99% of the suspend/resume cycle. It's a *huge* thing for debugging problems that happen in the wild, and one of the biggest issues is that we currently usually just get a "the machine died" message when suspend or resume doesn't work. Yes, doing printk's to the Intel management flash stuff can help a lot here, and I want that too, but I'd really like to shut down consoles individually rather than having the big hammer approach that shuts them up entirely over the whole suspend/resume sequence (or not at all, if you use no_console_suspend). And I'd *really* like to do things like VGA-console shutdown in the late phase (and resume early). - it's actually likely *much* simpler for some devices. Simple devices (and that includes things like PCI bridges etc, but also potentially USB host controllers etc) are things that can often be trivially suspended - all the complexity is really not in the controller itself, but beyond, in the bus that it actually drives. And the late-suspend/early-resume means that you don't have to worry about things like interrupts happening while you're suspended. Yes, putting the device into D3 will disable interrupts from that device too (unless there are bugs), *BUT* you may be sharing an interrupt line, and interrupts may be posted and delayed, so an earlier interrupt may well be pending etc. suspending late and resuming early just avoids those issues entirely. Sometimes these things interact.
For example, firewire is certainly not trivial to suspend as a subsystem thing (ie all the devices behind the firewire bridge need to do magic things, like spinning down etc that obviously can not happen in the final late phase), but the firewire controller itself is likely trivial to suspend/resume and can easily be handled in the late/early routines. And guess what? It's also exactly what you want to happen in case you end up using the firewire RDMA as a debug aid. IOW, you want that firewire controller (and the PCI bridges) working really early, so that if a problem does happen when you resume some more complex device (say, one of the graphics chips that need X to really come alive), you can use the firewire rdma to read out the kernel log buffer from memory. Well, it seems like we'll have to fix drivers in either case, and isn't a kexec approach fundamentally more sound and simple, design-wise? Rafael pointed out some problems with properly setting wakeup states, but I think that could be overcome... I don't personally mind kexec at all, but on the other hand, I don't care about suspend-to-disk in the first place. I do know that some people really don't want it, and I suspect that they have valid reasons. Ranging from memory use to simply just performance. Linus - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
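[Archive note] The nesting of the normal and late/early phases described in the mail above can be illustrated with a small user-space model. It is a sketch under assumptions, not the PM core's code: the device table, struct phase_dev and the function names are hypothetical, and the ordering (children suspended first, parents resumed first) is the conventional one rather than something stated in the mail. The controller and bridge only implement the late/early pair, while the device behind them uses the normal pair.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical device table, listed parents first. */
struct phase_dev {
	const char *name;
	int has_normal;  /* implements suspend()/resume() */
	int has_late;    /* implements suspend_late()/resume_early() */
};

static const struct phase_dev phase_devs[] = {
	{ "pci-bridge",    0, 1 },
	{ "fw-controller", 0, 1 },
	{ "fw-disk",       1, 0 },
};
#define NDEV ((int)(sizeof(phase_devs) / sizeof(phase_devs[0])))

/* Append "phase:device " to the log if there is room left. */
static void log_step(char *log, size_t len, const char *phase, const char *dev)
{
	size_t used = strlen(log);

	if (used < len)
		snprintf(log + used, len - used, "%s:%s ", phase, dev);
}

/* Record the order in which callbacks would run over one S3 cycle. */
static void toy_suspend_resume(char *log, size_t len)
{
	int i;

	if (len == 0)
		return;
	log[0] = '\0';

	for (i = NDEV - 1; i >= 0; i--)          /* normal suspend, children first */
		if (phase_devs[i].has_normal)
			log_step(log, len, "suspend", phase_devs[i].name);
	for (i = NDEV - 1; i >= 0; i--)          /* late phase, interrupts off */
		if (phase_devs[i].has_late)
			log_step(log, len, "suspend_late", phase_devs[i].name);

	/* *enter S3* */

	for (i = 0; i < NDEV; i++)               /* early phase, parents first */
		if (phase_devs[i].has_late)
			log_step(log, len, "resume_early", phase_devs[i].name);
	for (i = 0; i < NDEV; i++)               /* normal resume */
		if (phase_devs[i].has_normal)
			log_step(log, len, "resume", phase_devs[i].name);
}
```

In this model the bridge and the controller are back before any normal resume callback runs, which matches the debugging argument in the mail: the simple devices on the path are usable while the more complex ones are still coming up.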
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, 20 of February 2008, Jesse Barnes wrote: On Wednesday, February 20, 2008 1:13 pm Linus Torvalds wrote: On Wed, 20 Feb 2008, Jesse Barnes wrote: The current callback system looks like this (according to Rafael and the last time I looked): -suspend(PMSG_FREEZE) -resume() -suspend(PMSG_SUSPEND) *enter S3 or power off* -resume() Yes, it's very messy. It's messy for a few different reasons: - the one you hit: a driver actually has a really hard time telling what PMSG_SUSPEND really means. In fact the driver can find out in which state to put the device into, depending on the target ACPI state which is known. - more importantly, we generally don't want to suspend/resume the hardware at all around a power-off, because we're going to resume with the state at the time of the PMSG_FREEZE, which means that the hardware has actually *changed* and been used in between! Exactly. So the -resume really isn't a resume at all. It's much closer to a -reset. Yeah, in the hibernate case this is definitely true. Agreed. Of course, the solution to this all right now is that we have to reset everything even if it *is* a suspend event, so it basically means that STR ends up using the much weaker model that snapshot-to-disk uses. The fundamental problem being that the two really have nothing what-so-ever to do with each other. They aren't even similar. Never were. And in the long term we could have: -suspend() *enter S3* -resume() Yes, apart from all the complexities (suspend_late/resume_early). So in reality it's more than that, but the suspend/resume things are clearly nesting, and they have the potential to actually keep state around (because we *know* this machine is not going to mess with the devices in between). Really, in the simple s3 case we still need early/late stuff? Yes, we do. There are devices that need to be suspended with interrupts off. IOW, here we actually can have as an option assume the device is there when you return. 
That is, unless the user pulls out that pendrive while suspended, no? or: -hibernate() *kexec to another kernel to save image* *power off* -return_from_hibernate() (or somesuch) Enough people don't trust kexec that I suspect the right thing simply is -freeze() // stop dma, synchronize device state *snapshot* -unfreeze(); // resume dma *save image* [ optionally -poweroff() ] // do we really care? I'd say no We do, if there are devices that wake us up from S4 and don't wake us up from S5, for example. Plus this f*cking fan in my box that doesn't work after the resume if we don't do -poweroff() ... *power off* -restore() // reset device to the frozen one which may have four entry-points that can be illogically mapped to the suspend/resume ones like we do now, but they really have nothing to do with suspending/resuming. Apart from putting devices into the right low power states, that is. Well, it seems like we'll have to fix drivers in either case, and isn't a kexec approach fundamentally more sound and simple, design-wise? Rafael pointed out some problems with properly setting wakeup states, but I think that could be overcome... Your honor, I would like to register a differing opinion ... And notice how while freeze/restore kind of pairs like a suspend/resume, it really shouldn't be expected to realistically restore the same state at all. The restore part is generally much better seen as a reset hardware than a resume thing. That's absolutely correct. Because we literally cannot trust *anything* about the state since we froze it - we might have booted a different OS in between etc. Very different from suspend/resume. Yeah, definitely. It has to be much more robust and deal with configuration changes, etc. (within reason). Agreed. Thanks, Rafael - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
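[Archive note] The S4 versus S5 wakeup point made in the mail above can be written down as a tiny check. This is a hedged sketch: struct wake_dev, deepest_wake_state and should_arm_wakeup() are hypothetical names, and the rule "a device that can wake from Sx can also wake from shallower states" is an assumption made for the example, not something the mail states.

```c
#include <stdbool.h>

/* ACPI system sleep states mentioned in the thread. */
enum sleep_state { S3 = 3, S4 = 4, S5 = 5 };

struct wake_dev {
	const char *name;
	int deepest_wake_state;  /* deepest Sx the device can wake from; 0 = never */
};

/*
 * Decide whether the driver should arm wakeup for the given target state.
 * A device that wakes from S4 but not S5 must be treated differently when
 * hibernating (enter S4) and when powering off (enter S5).
 */
static bool should_arm_wakeup(const struct wake_dev *d, enum sleep_state target)
{
	return d->deepest_wake_state != 0 && (int)target <= d->deepest_wake_state;
}
```

For a device with deepest_wake_state set to 4, the check says yes for S3 and S4 and no for S5, so a power-off callback that knows its target state can skip the wakeup setup that a hibernation callback would need.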
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote: On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote: So, next I'll try shutdown to see if it works. I was using platform. Ok, that would be good to try. shutdown does power down properly. But still green on resume. Looks like the AR registers are hosed, which is what I thought I fixed... Can you attach your i915_drv.c file just so I can sanity check it? Attached. Jeff, for the hang on suspend problem, I now suspect something else in 2.6.25-rc2 caused that. Can you try the 2.6.25-rc1 version of i915_drv.c (in fact all of drivers/char/drm from 2.6.25-rc1) but in a 2.6.25-rc2 kernel? I ask because 2.6.25-rc1 suspends to disk just fine for me and resumes w/o a green screen, while 2.6.25-rc2 fails to suspend (hangs like you say) and gives me a green screen. Were there other changes in ACPI or the PM core that might have caused this I wonder? Thanks, Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Hi. Jesse Barnes wrote: Well, it seems like we'll have to fix drivers in either case, and isn't a kexec approach fundamentally more sound and simple, design-wise? Rafael pointed out some problems with properly setting wakeup states, but I think that could be overcome... No. AFAICS, kexec is going to be more complex and ugly in many ways. To summarise, a kexec based hibernation is going to need the following additional requirements to just replace what we already have: - get the original kernel to allocate storage while racing against the rest of the system (currently allocation is done post-atomic copy post-freezing - no racing). This makes it potentially slower, too; - get the original kernel to transfer the information about what swap was allocated to the kexec'd kernel, probably together with a lot of other information (which pages are nosave etc). - get the original kernel to keep memory free for the kexec'd kernel which would otherwise be usable. Not a biggy on desktops or laptops, but think about embedded. - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? I also wonder about how much of a pain it's going to be setting up userspace for this kexec'd kernel. Will you need a separate partition just for it? If not, will the userspace be loaded into memory all the time (more memory wasted for normal use), or loaded from ordinary partitions at kexec time (how to do safely? - more info to transfer between kernels?). I'd love it if kexec really was the panacea to the freezer issues, but problems like these make me think it isn't a viable solution. 
Nigel
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 2:32 pm Jesse Barnes wrote: On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote: On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote: So, next I'll try shutdown to see if it works. I was using platform. Ok, that would be good to try. shutdown does power down properly. But still green on resume. Looks like the AR registers are hosed, which is what I thought I fixed... Can you attach your i915_drv.c file just so I can sanity check it? Attached. Jeff, for the hang on suspend problem, I now suspect something else in 2.6.25-rc2 caused that. Looks like 2.6.25-rc1 also had broken suspend (my test was broken). IIRC, Dave and I had it working at LCA using the out of tree DRM modules on 2.6.23.14 or 15... Maybe you could give that a try? Thanks, Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wed, 20 Feb 2008, Rafael J. Wysocki wrote: which may have four entry-points that can be illogically mapped to the suspend/resume ones like we do now, but they really have nothing to do with suspending/resuming. Apart from putting devices into the right low power states, that is. And by "right low power states" you mean "wrong low-power states", right? The thing is, they really *are* the wrong states for 99% of all hardware. If you really have a piece of hardware that you want to have the -poweroff() thing do the same as -suspend(), then hey, just use the same function (or better yet, use two different functions with a call to a shared part). Because IT IS NOT TRUE that -suspend() puts the devices in the right power state. The power states are likely to be totally different for S3 and for poweroff, and they are going to differ in different ways depending on the device type. One example would be the one that started this version of the whole discussion (shock horror! We're on subject!) ie when you do a system shutdown, you generally do not even *want* to put individual devices into low-power states at all, because the actual "power off the system" thing will take care of it for you much better. So to take just something as simple as VGA as an example: you really do not want to suspend that device, because you want to see the poweroff messages until the very end. So that final device -poweroff function really has absolutely *nothing* in common with the device -suspend[_late] functions, simply because almost any sane driver would decide to do different things. Of course, we can continue to do the insane thing and just continue to use inappropriate and misleading function callback names, and then encoding what the *real* action should be in the argument and/or in magic system-wide state parameters.
So in that sense, it's certainly totally the same thing whether we call it -shutdown or -poweroff or -eat_a_banana, since you could always just look at the argument and other clues, and decide that *this* time, for *this* kind of device, the "eat a banana" callback actually means that we should power it off, but wouldn't it be a lot more logical to just make it clear in the first place that they aren't called for the same reason at all?

I'd claim that it's much easier for everybody (and _especially_ for device driver writers) to have

	static int my_shutdown(struct pci_device *dev, int state)
	{
		.. do something ..
	}

	static int my_suspend(struct pci_device *dev, int state)
	{
		.. do something ..
	}

	...
	.shutdown = my_shutdown,
	.suspend = my_suspend,
	...

than to have

	static int my_suspend(struct pci_device *dev, state)
	{
		.. common code ..
		if (state == XYZZY)
			..special code..
		else
			..other case code..
	}

	...
	.suspend = my_suspend
	...

even if the latter might be fewer lines. It doesn't really matter if it's fewer, does it, if the alternate version is more obvious about what it does?

The other issue is that I've long wanted to make sure that when people fix suspend-to-ram, they don't screw up suspend-to-disk by mistake and vice versa. When a driver writer makes changes, he shouldn't have the kind of illogical "oops, unintended consequences" issues in general. It should be pretty damn obvious when he changes suspend code vs when he changes snapshot/restore code.

We've somewhat untangled that on the core kernel layer, but we've left the driver confusion alone.

Linus
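[Archive note] A compilable variant of the "two separate entry points" style argued for in the mail above might look like the sketch below. Everything in it is hypothetical and made up for illustration (struct toy_pci_dev, struct toy_driver_ops, the vga_* functions and toy_core_call()); it is not the kernel's driver model. It uses the VGA example from the mail: on suspend the console is shut down and the device powered down, on shutdown the device is left alone so messages stay visible.

```c
#include <stddef.h>

/* Hypothetical device state; power_state 0 means fully on, 3 means D3. */
struct toy_pci_dev {
	int power_state;
	int console_on;
};

/* One callback per reason the core calls into the driver. */
struct toy_driver_ops {
	int (*suspend)(struct toy_pci_dev *dev);
	int (*shutdown)(struct toy_pci_dev *dev);
};

/* Suspend to RAM: silence the console and put the device in a low-power state. */
static int vga_suspend(struct toy_pci_dev *dev)
{
	dev->console_on = 0;
	dev->power_state = 3;
	return 0;
}

/* Power off: leave the device alone so messages remain visible to the end. */
static int vga_shutdown(struct toy_pci_dev *dev)
{
	(void)dev;
	return 0;
}

static const struct toy_driver_ops vga_ops = {
	.suspend  = vga_suspend,
	.shutdown = vga_shutdown,
};

enum toy_transition { TOY_SUSPEND, TOY_SHUTDOWN };

/* The core picks the entry point; the driver never has to decode an argument. */
static int toy_core_call(const struct toy_driver_ops *ops,
			 struct toy_pci_dev *dev, enum toy_transition t)
{
	int (*cb)(struct toy_pci_dev *) =
		(t == TOY_SUSPEND) ? ops->suspend : ops->shutdown;

	return cb ? cb(dev) : 0;
}
```

With this layout a change to vga_suspend() cannot alter what happens on shutdown, which is the "no unintended consequences" property the mail asks for.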
ACPI: DMI Product Name: OptiPlex 330
Initializing cgroup subsys cpuset
Linux version 2.6.24-1-amd64 (Debian 2.6.24-4~mtu1) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Fri Feb 15 11:09:50 JST 2008
Command line: root=/dev/md0 ro
BIOS-provided physical RAM map:
 BIOS-e820: - 0009fc00 (usable)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 7f55ac00 (usable)
 BIOS-e820: 7f55ac00 - 7f55ec00 (ACPI NVS)
 BIOS-e820: 7f55ec00 - 7f560c00 (ACPI data)
 BIOS-e820: 7f560c00 - 8000 (reserved)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fec0 - fed00400 (reserved)
 BIOS-e820: fed2 - feda (reserved)
 BIOS-e820: fee0 - fef0 (reserved)
 BIOS-e820: ffb0 - 0001 (reserved)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 521562) 1 entries of 3200 used
end_pfn_map = 1048576
DMI 2.5 present.
ACPI: RSDP 000FEBF0, 0024 (r2 DELL )
ACPI: XSDT 000FD033, 0064 (r1 DELLB9KD ASL61)
ACPI: FACP 000FD153, 00F4 (r3 DELLB9KD ASL61)
ACPI: DSDT FFF5DB8E, 3315 (r1 DELLdt_ex 1000 INTL 20050624)
ACPI: FACS 7F55AC00, 0040
ACPI: SSDT FFF60FC4, 00AC (r1 DELLst_ex 1000 INTL 20050624)
ACPI: APIC 000FD247, 0092 (r1 DELLB9KD ASL61)
ACPI: BOOT 000FD2D9, 0028 (r1 DELLB9KD ASL61)
ACPI: ASF! 000FD301, 0092 (r32 DELLB9KD ASL61)
ACPI: MCFG 000FD393, 003E (r1 DELLB9KD ASL61)
ACPI: HPET 000FD3D1, 0038 (r1 DELLB9KD ASL61)
ACPI: SLIC 000FD409, 00C0 (r1 DELLB9KD ASL61)
No NUMA configuration found
Faking a node at -7f55a000
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 521562) 1 entries of 3200 used
Bootmem setup node 0 -7f55a000
Zone PFN ranges:
  DMA 0 - 4096
  DMA32 4096 - 1048576
  Normal 1048576 - 1048576
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
  0: 0 - 159
  0: 256 - 521562
On node 0 totalpages: 521465
  DMA zone: 56 pages used for memmap
  DMA zone: 1068 pages reserved
  DMA zone: 2875 pages, LIFO batch:0
  DMA32 zone: 7074 pages used for memmap
  DMA32 zone: 510392 pages, LIFO batch:31
  Normal zone: 0 pages used for memmap
  Movable zone: 0 pages used for memmap
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x05] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x00] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x03] disabled)
ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 8, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
ACPI: HPET id: 0x8086a201 base: 0xfed0
Using ACPI (MADT) for SMP configuration information
swsusp: Registered nosave memory region: 0009f000 - 000f
swsusp: Registered nosave memory region: 000f - 0010
Allocating PCI resources starting at 8800 (gap: 8000:6000)
SMP: Allowing 8 CPUs, 7 hotplug CPUs
PERCPU: Allocating 34400 bytes of per cpu data
Built 1 zonelists in Node order, mobility grouping on. Total pages: 513267
Policy zone: DMA32
Kernel command line: root=/dev/md0 ro
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
hpet clockevent registered
TSC calibrated against HPET
time.c: Detected 1595.991 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Checking aperture...
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
Memory: 2047256k/2086248k available (2148k kernel code, 38604k reserved, 1006k data, 316k init)
Calibrating delay using timer specific routine.. 3194.59 BogoMIPS (lpj=6389190)
Security Framework initialized
SELinux: Disabled at boot.
Capability LSM initialized
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Mount-cache hash table entries: 256
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 512K
CPU 0/0 - Node 0
using
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thursday, 21 of February 2008, Linus Torvalds wrote:
> On Wed, 20 Feb 2008, Rafael J. Wysocki wrote:
> > > which may have four entry-points that can be illogically mapped to the suspend/resume ones like we do now, but they really have nothing to do with suspending/resuming.
> >
> > Apart from putting devices into the right low power states, that is.
>
> And by "right low power states" you mean "wrong low-power states", right?

No, I don't.

> The thing is, they really *are* the wrong states for 99% of all hardware.
>
> If you really have a piece of hardware that you want to have the ->poweroff() thing do the same as ->suspend(), then hey, just use the same function (or better yet, use two different functions with a call to a shared part).
>
> Because IT IS NOT TRUE that ->suspend() puts the devices in the right power state. The power states are likely to be totally different for S3 and for poweroff, and they are going to differ in different ways depending on the device type.

In fact we have acpi_pci_choose_state() that tells the driver which power state to put the device into in ->suspend(). If that is used, the device ends up in the state expected by the BIOS for S4.

> One example would be the one that started this version of the whole discussion (shock horror! We're on subject!), ie when you do a system shutdown, you generally do not even *want* to put individual devices into low-power states at all, because the actual "power off the system" thing will take care of it for you much better.

No. Again, if there are devices that wake us up from S4, but not from S5, they need to be handled differently in the *enter S4* case (hibernation) and in the *enter S5* case (powering off the system).

> So to take just something as simple as VGA as an example: you really do not want to suspend that device, because you want to see the poweroff messages until the very end.
>
> So that final device ->poweroff function really has absolutely *nothing* in common with the device ->suspend[_late] functions, simply because almost any sane driver would decide to do different things.

Yes, it would. Still, the common thing is, it (ie. ->poweroff) _may_ want to put the device into a low power state different from D3.

> Of course, we can continue to do the insane thing and just continue to use inappropriate and misleading function callback names, and then encoding what the *real* action should be in the argument and/or in magic system-wide state parameters.

To clarify, I agree that we should use different callbacks for hibernation. I'm only saying that _in_ _general_ we may need the ->poweroff callback.

> So in that sense, it's certainly totally the same thing whether we call it ->shutdown or ->poweroff or ->eat_a_banana, since you could always just look at the argument and other clues, and decide that *this* time, for *this* kind of device, the "eat a banana" callback actually means that we should power it off, but wouldn't it be a lot more logical to just make it clear in the first place that they aren't called for the same reason at all?
>
> I'd claim that it's much easier for everybody (and _especially_ for device driver writers) to have
>
>     static int my_shutdown(struct pci_device *dev, int state)
>     {
>         .. do something ..
>     }
>
>     static int my_suspend(struct pci_device *dev, int state)
>     {
>         .. do something ..
>     }
>
>     ...
>     .shutdown = my_shutdown,
>     .suspend = my_suspend,
>     ...
>
> than to have
>
>     static int my_suspend(struct pci_device *dev, int state)
>     {
>         .. common code ..
>         if (state == XYZZY)
>             ..special code..
>         else
>             ..other case code..
>     }
>
>     ...
>     .suspend = my_suspend
>     ...
>
> even if the latter might be fewer lines. It doesn't really matter if it's fewer, does it, if the alternate version is more obvious about what it does?
>
> The other issue is that I've long wanted to make sure that when people fix suspend-to-ram, they don't screw up suspend-to-disk by mistake and vice versa.
>
> When a driver writer makes changes, he shouldn't have the kind of illogical "oops, unintended consequences" issues in general. It should be pretty damn obvious when he changes suspend code vs when he changes snapshot/restore code. We've somewhat untangled that on the core kernel layer, but we've left the driver confusion alone.

Well, I agree with that. As I said before, that's mainly because I've been busy with other stuff recently. Now, with Alex's help, I'm hoping to take care of it soon.

Thanks,
Rafael

- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
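The two-callback layout argued for in the message above can be modelled in plain user-space C. This is only a sketch under stated assumptions: `struct demo_dev`, `struct demo_pm_ops` and every `demo_*` function are hypothetical names invented for illustration and are not the kernel's driver API. It shows a VGA-like device whose suspend path drops it to a low-power state while its power-off path deliberately leaves the display alone.

```c
/* Hypothetical stand-in for a device; this is not struct pci_dev. */
struct demo_dev {
    int power_state;  /* 0 = fully on (D0), 3 = low power (D3) */
    int display_on;   /* 1 while the console is still visible */
};

/* Hypothetical ops table: one callback per reason for being called. */
struct demo_pm_ops {
    int (*suspend)(struct demo_dev *dev);
    int (*poweroff)(struct demo_dev *dev);
};

/* Suspend: blank the display and drop to a low-power state. */
static int demo_vga_suspend(struct demo_dev *dev)
{
    dev->display_on = 0;
    dev->power_state = 3;
    return 0;
}

/* Power-off: leave the device alone so the last messages stay visible. */
static int demo_vga_poweroff(struct demo_dev *dev)
{
    (void)dev;
    return 0;
}

static const struct demo_pm_ops demo_vga_ops = {
    .suspend  = demo_vga_suspend,
    .poweroff = demo_vga_poweroff,
};

/* Core-side helpers: each transition calls only its own callback. */
static int demo_core_suspend(const struct demo_pm_ops *ops, struct demo_dev *dev)
{
    return ops->suspend ? ops->suspend(dev) : 0;
}

static int demo_core_poweroff(const struct demo_pm_ops *ops, struct demo_dev *dev)
{
    return ops->poweroff ? ops->poweroff(dev) : 0;
}
```

Calling `demo_core_poweroff(&demo_vga_ops, &dev)` leaves `dev` untouched, while `demo_core_suspend()` moves it to state 3. Neither callback has to inspect a `state` argument to work out why it was invoked.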
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 3:03 pm Jesse Barnes wrote: On Wednesday, February 20, 2008 2:32 pm Jesse Barnes wrote: On Wednesday, February 20, 2008 11:10 am Jeff Chua wrote: On Feb 21, 2008 2:53 AM, Jesse Barnes [EMAIL PROTECTED] wrote: So, next I'll try shutdown to see if it works. I was using platform. Ok, that would be good to try. shutdown does power down properly. But still green on resume. Looks like the AR registers are hosed, which is what I thought I fixed... Can you attach your i915_drv.c file just so I can sanity check it? Attached. Jeff, for the hang on suspend problem, I now suspect something else in 2.6.25-rc2 caused that. Looks like 2.6.25-rc1 also had broken suspend (my test was broken). IIRC, Dave and I had it working at LCA using the out of tree DRM modules on 2.6.23.14 or 15... Maybe you could give that a try? And just to confirm that, I just tested the current DRM modules against a 2.6.23.15 kernel. It suspends to disk correctly (w/o a hang) and doesn't give me a green screen, so something in 2.6.25 must be causing that (even 2.6.25-rc1 seems to have the problem). Also, this patch against 2.6.25-rc1 seemed to prevent the 'green screen' problem. 2.6.25-rc2 already has part of it... Anyway, let me know how your testing goes. Thanks, Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: In fact we have acpi_pci_choose_state() that tells the driver which power state to put the device into in -suspend(). If that is used, the device ends up in the state expected by to BIOS for S4. First off, nobody should *ever* use that directly anyway. Secondly, the one that people should use (pci_choose_state()) doesn't actually do what you claim it does. It does all kinds of wrong things, and doesn't even take the target state into account at all. So look again. No. Again, if there are devices that wake us up from S4, but not from S5, they need to be handled differently in the *enter S4* case (hibernation) and in the *enter S5* case (powering off the system). And again, what does this have to do with (the example I used) the graphics hardware? Answer: nothing. The example I gave you we simply DO THE WRONG THING FOR. Same thing for things like USB devices - where pci_choose_state() doesn't work to begin with. Why do we call suspend() on such a thing when we don't want to suspend it? We shouldn't. We should call freeze/unfreeze (which are no-ops) and then finally perhaps poweroff, and that final stage might want to spin things down or similar. But *none* of it has anything to do with suspend, and none of it has anything to do with pci_choose_state() (much less acpi_pci_choose_state) The fact is, we should let the driver decide, and we should make it clear to the driver writer what he is deciding about - rather than basically lie and say suspend the device and put it into D3 even when that's the last thing it should ever do. Linus - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote: - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. I also wonder about how much of a pain it's going to be setting up userspace for this kexec'd kernel. Will you need a separate partition just for it? If not, will the userspace be loaded into memory all the time (more memory wasted for normal use), or loaded from ordinary partitions at kexec time (how to do safely? - more info to transfer between kernels?). You're looking at a tiny amount of memory when compared to current systems. It's really not a problem. -- Matthew Garrett | [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thursday, 21 of February 2008, Linus Torvalds wrote: On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: In fact we have acpi_pci_choose_state() that tells the driver which power state to put the device into in -suspend(). If that is used, the device ends up in the state expected by to BIOS for S4. First off, nobody should *ever* use that directly anyway. Yes, sorry. Secondly, the one that people should use (pci_choose_state()) doesn't actually do what you claim it does. It does all kinds of wrong things, and doesn't even take the target state into account at all. So look again. Well, if platform_pci_choose_state() is defined, pci_choose_state() returns its result and on ACPI systems that points to acpi_pci_choose_state(), so in fact it does what I said (apart from the error path). No. Again, if there are devices that wake us up from S4, but not from S5, they need to be handled differently in the *enter S4* case (hibernation) and in the *enter S5* case (powering off the system). And again, what does this have to do with (the example I used) the graphics hardware? Answer: nothing. The example I gave you we simply DO THE WRONG THING FOR. Same thing for things like USB devices - where pci_choose_state() doesn't work to begin with. Why do we call suspend() on such a thing when we don't want to suspend it? We shouldn't. We should call freeze/unfreeze (which are no-ops) and then finally perhaps poweroff, and that final stage might want to spin things down or similar. I'm already convinced, really. :-) Thanks, Rafael - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 3:49 pm Rafael J. Wysocki wrote:
> > And just to confirm that, I just tested the current DRM modules against a 2.6.23.15 kernel.
>
> In 2.6.23.x there's no second ->suspend() during hibernation, so no wonder.

In 2.6.23 it's just:
  ->suspend()
  ->resume()
  *S4*
?

I ask because we still do the D3hot call in the DRM tree, so the hang should still occur unless the PM or ACPI core has changed.

> I'll figure out how to work around this issue in the current mainline, but a real fix will only be possible when we have separate callbacks for hibernation.

Ok, thanks.

Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: Secondly, the one that people should use (pci_choose_state()) doesn't actually do what you claim it does. It does all kinds of wrong things, and doesn't even take the target state into account at all. So look again. Well, if platform_pci_choose_state() is defined, pci_choose_state() returns its result and on ACPI systems that points to acpi_pci_choose_state(), so in fact it does what I said (apart from the error path). Did you check closer? I repeat: acpi_pci_choose_state() (when called from pci_choose_state()) doesn't even look at the target 'state'. It just blindly assumes that you want the deepest sleep-state you can have. Which happens to be correct for normal suspend, but means that if you want to test other states (through '/sys/devices/.../power'), that sounds broken. I didn't check any closer, but go check it yourself. The short and sweet: acpi_pci_choose_state() totally ignores its 'state' argument. Do you really think that's correct? But yes, pci_choose_state()' effectively does that too, apart from PM_EVENT_ON, which is never used. (But the whole and only point of pci_choose_state() was to do the PM_EVENT_FREEZE thing differently, which it doesn't do, so I think the real issue here is that the interface is really rather mis-designed) I suspect most people who ever really looked and worked on this code had a specific device in mind, and I'm sure that all of the code individually always ends up making sense from the standpoint of some specific device driver. It's just that it never seems to make sense from a bigger issues standpoint, and often seems senseless from the standpoint of other devices of other types. Linus - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 5:37 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> Ok, can you give this patch a try with the 'platform' method? It should at least tell us what ACPI would like the device to do at suspend time, but it probably won't fix the hang.

I can't get it to compile.

drivers/char/drm/i915_drv.c: In function 'i915_suspend':
drivers/char/drm/i915_drv.c:372: error: implicit declaration of function 'acpi_pci_choose_state'
drivers/char/drm/i915_drv.c: In function 'i915_resume':
drivers/char/drm/i915_drv.c:383: error: 'state' undeclared (first use in this function)
drivers/char/drm/i915_drv.c:383: error: (Each undeclared identifier is reported only once
drivers/char/drm/i915_drv.c:383: error: for each function it appears in.)
make[3]: *** [drivers/char/drm/i915_drv.o] Error 1
make[2]: *** [drivers/char/drm] Error 2
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

Thanks,
Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Hi. Matthew Garrett wrote: On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote: - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. You can draw silly diagrams about how the freezer supposedly works in LCA slides and spread FUD as much as you like. In the end, though, it's not nearly as hit-and-miss as you say, and replacing the freezer with a kexec based freezer is only going to create as many problems as it removes. I also wonder about how much of a pain it's going to be setting up userspace for this kexec'd kernel. Will you need a separate partition just for it? If not, will the userspace be loaded into memory all the time (more memory wasted for normal use), or loaded from ordinary partitions at kexec time (how to do safely? - more info to transfer between kernels?). You're looking at a tiny amount of memory when compared to current systems. It's really not a problem. Please, quantify 'tiny'. In embedded, 5MB can be too much. I've worked on embedded solutions. I'm not pulling problems out of thin air. 
Regards, Nigel
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote:
> Hi.
>
> Matthew Garrett wrote:
> > On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote:
> > > - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it?
> >
> > No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept.
>
> Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy.

Racy with regards to other things besides trying to suspend a machine? If so, what?

thanks,
greg k-h
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 4:35 pm Jeff Chua wrote:
> On Thu, Feb 21, 2008 at 5:37 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > Ok, can you give this patch a try with the 'platform' method? It should at least tell us what ACPI would like the device to do at suspend time, but it probably won't fix the hang.
>
> I can't get it to compile.
>
> drivers/char/drm/i915_drv.c: In function 'i915_suspend':
> drivers/char/drm/i915_drv.c:372: error: implicit declaration of function 'acpi_pci_choose_state'

Oops, maybe this should just be pci_choose_state instead.

> drivers/char/drm/i915_drv.c: In function 'i915_resume':
> drivers/char/drm/i915_drv.c:383: error: 'state' undeclared (first use in this function)
> drivers/char/drm/i915_drv.c:383: error: (Each undeclared identifier is reported only once
> drivers/char/drm/i915_drv.c:383: error: for each function it appears in.)

And this change should just be reverted (leave it as PCI_D0).

Thanks,
Jesse
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thursday, 21 of February 2008, Linus Torvalds wrote: On Thu, 21 Feb 2008, Rafael J. Wysocki wrote: Secondly, the one that people should use (pci_choose_state()) doesn't actually do what you claim it does. It does all kinds of wrong things, and doesn't even take the target state into account at all. So look again. Well, if platform_pci_choose_state() is defined, pci_choose_state() returns its result and on ACPI systems that points to acpi_pci_choose_state(), so in fact it does what I said (apart from the error path). Did you check closer? Yes, I did. I repeat: acpi_pci_choose_state() (when called from pci_choose_state()) doesn't even look at the target 'state'. It just blindly assumes that you want the deepest sleep-state you can have. acpi_pm_device_sleep_state() (that is called by acpi_pci_choose_state()) takes the target state directly from the ACPI layer. We just want to get rid of the argument passed to -suspend() eventually, but there may be many _suspend_ states available (eg. mem and standby) and for each of them there may be different constraints on the device's state. We have to tell the driver which device states are possible in the target system sleep state. Right now we arbitrarily choose the one with the lowest power usage - for given target system sleep state. Which happens to be correct for normal suspend, but means that if you want to test other states (through '/sys/devices/.../power'), that sounds broken. This interface is not available any more (ie. there's only wakeup in /sys/devices/.../power). I didn't check any closer, but go check it yourself. The short and sweet: acpi_pci_choose_state() totally ignores its 'state' argument. Do you really think that's correct? Yes, I do. But yes, pci_choose_state()' effectively does that too, apart from PM_EVENT_ON, which is never used. 
(But the whole and only point of pci_choose_state() was to do the PM_EVENT_FREEZE thing differently, which it doesn't do, so I think the real issue here is that the interface is really rather mis-designed) You're wrong, sorry. With PM_EVENT_FREEZE it wouldn't even be necessary. It's there, because potentially there are many possibilities with PM_EVENT_SUSPEND and in fact it shouldn't even be used with PM_EVENT_FREEZE. All of this is more or less orthogonal to the issue at hand, which boils down to the fact that we use the _suspend_ callbacks for hibernation and we shouldn't be doing that. Thanks, Rafael - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
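The disagreement above is easier to follow with a small model of the delegation being described: a generic chooser that defers to a platform hook when one is registered, and a platform hook that ignores the event argument and instead reads the target system sleep state from platform-side data. This is a user-space sketch only. All `demo_*` names are hypothetical, and the fallback mapping and the sleep-state table are assumptions made up for illustration, not the kernel's or any BIOS's actual values.

```c
enum demo_event { DEMO_EVENT_ON, DEMO_EVENT_FREEZE, DEMO_EVENT_SUSPEND };
enum demo_dstate { DEMO_D0 = 0, DEMO_D1, DEMO_D2, DEMO_D3 };

/* Target system sleep state, set by the "platform" before callbacks run. */
static int demo_target_sleep_state = 3;

/* Optional platform hook; stays NULL when no platform code registers one. */
static int (*demo_platform_choose_state)(enum demo_event ev);

/*
 * Platform-style hook: the event argument is ignored, and the answer
 * depends only on the target system sleep state. The table is made up.
 */
static int demo_acpi_choose_state(enum demo_event ev)
{
    (void)ev;
    switch (demo_target_sleep_state) {
    case 1:  return DEMO_D1;
    case 3:  return DEMO_D3;
    case 4:  return DEMO_D3;
    default: return -1;   /* no opinion: let the generic code decide */
    }
}

/* Generic chooser: defer to the platform hook if it has an answer. */
static enum demo_dstate demo_choose_state(enum demo_event ev)
{
    if (demo_platform_choose_state) {
        int ret = demo_platform_choose_state(ev);
        if (ret >= 0)
            return (enum demo_dstate)ret;
    }

    /* Assumed fallback mapping, used only when no hook answered. */
    switch (ev) {
    case DEMO_EVENT_ON:
    case DEMO_EVENT_FREEZE:
        return DEMO_D0;
    case DEMO_EVENT_SUSPEND:
        return DEMO_D3;
    }
    return DEMO_D0;
}
```

With the hook registered, `demo_choose_state(DEMO_EVENT_FREEZE)` and `demo_choose_state(DEMO_EVENT_SUSPEND)` return the same value, which is the behaviour one side calls "ignoring the state argument" and the other calls "taking the target state from the platform layer".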
Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thursday, 21 of February 2008, Jesse Barnes wrote:
> On Wednesday, February 20, 2008 3:49 pm Rafael J. Wysocki wrote:
> > > And just to confirm that, I just tested the current DRM modules against a 2.6.23.15 kernel.
> >
> > In 2.6.23.x there's no second ->suspend() during hibernation, so no wonder.
>
> In 2.6.23 it's just:
>   ->suspend()
>   ->resume()

  ->shutdown() (that breaks wake up from S4 with many devices, including but not limited to the RTC wake alarm).

>   *S4*
> ?

Thanks,
Rafael
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote: Matthew Garrett wrote: No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. You can draw silly diagrams about how the freezer supposedly works in LCA slides and spread FUD as much as you like. In the end, though, it's not nearly as hit-and-miss as you say, and replacing the freezer with a kexec based freezer is only going to create as many problems as it removes. I'm really not interested in debating the matter. There are all sorts of potential uses for the freezer, but hibernation isn't one of them. We *need* to get rid of the freezer for suspend to RAM (because a band-aid to ensure atomicity is kind of pointless when the operation you're entering is inherently atomic), and once all the drivers are able to deal with that then it's trivial to get rid of it for hibernation as well. Arguing that the reality of userspace drivers is broken doesn't help here. It's what we have to work with. You're looking at a tiny amount of memory when compared to current systems. It's really not a problem. Please, quantify 'tiny'. In embedded, 5MB can be too much. I've worked on embedded solutions. I'm not pulling problems out of thin air. Then the in-kernel solution has already lost anyway, and I'm desperately unconcerned about out of tree stuff. 
-- Matthew Garrett | [EMAIL PROTECTED]
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Hi. Greg KH wrote: On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote: Hi. Matthew Garrett wrote: On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote: - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. Racy with regards to other things becides trying to suspend a machine? If so, what? That depends on what sort of tangled web you want to weave. Low memory situations is one other situation that occurs to me quickly, especially (though not only) if your ability to swap were to depend upon a userspace driver and/or filesystem. Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 8:39 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> Oops, maybe this should just be pci_choose_state instead.
>
> And this change should just be reverted (leave it as PCI_D0).

drivers/char/drm/i915_drv.c: In function 'i915_suspend':
drivers/char/drm/i915_drv.c:372: warning: passing argument 1 of 'pci_choose_state' from incompatible pointer type
drivers/char/drm/i915_drv.c:373: warning: passing argument 1 of 'pci_choose_state' from incompatible pointer type

I hope those are just warnings that can be ignored. Ok, rebooting and will get back shortly.

Thanks,
Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Wednesday, February 20, 2008 5:19 pm Jeff Chua wrote:
> On Thu, Feb 21, 2008 at 8:39 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > Oops, maybe this should just be pci_choose_state instead.
> >
> > And this change should just be reverted (leave it as PCI_D0).
>
> drivers/char/drm/i915_drv.c: In function 'i915_suspend':
> drivers/char/drm/i915_drv.c:372: warning: passing argument 1 of 'pci_choose_state' from incompatible pointer type
> drivers/char/drm/i915_drv.c:373: warning: passing argument 1 of 'pci_choose_state' from incompatible pointer type
>
> I hope those are just warnings that can be ignored.

Oops again, should be dev->pdev. Silly DRM layer obfuscation.

Jesse
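The "incompatible pointer type" warnings above come from handing a DRM-level device to a helper that expects the underlying PCI device. A minimal user-space sketch of the corrected call shape follows. The `demo_*` structures and functions are hypothetical stand-ins, not the kernel's `struct pci_dev`, `struct drm_device` or PCI helpers, and the value returned by `demo_pci_choose_state()` is an assumption chosen to match the "calling pci_set_power_state with 3" message seen later in the thread.

```c
/* Hypothetical stand-ins; the real structures live in the kernel headers. */
struct demo_pci_dev {
    int current_state;            /* 0 = D0, 3 = D3hot */
};

struct demo_drm_device {
    struct demo_pci_dev *pdev;    /* the underlying PCI device */
};

/* Stand-in for a PCI helper that only accepts the PCI device type. */
static int demo_pci_choose_state(struct demo_pci_dev *pdev)
{
    (void)pdev;
    return 3;                     /* assumed result, for illustration */
}

/* Stand-in for the helper that records the new power state. */
static int demo_pci_set_power_state(struct demo_pci_dev *pdev, int state)
{
    pdev->current_state = state;
    return 0;
}

/* The suspend hook receives the DRM wrapper and has to unwrap it. */
static int demo_i915_suspend(struct demo_drm_device *dev)
{
    /* Passing `dev` instead of `dev->pdev` here is the pointer-type mistake. */
    int state = demo_pci_choose_state(dev->pdev);

    return demo_pci_set_power_state(dev->pdev, state);
}
```

In this model, passing `dev` rather than `dev->pdev` draws the same kind of diagnostic from the compiler, which is why warnings like these are not safe to ignore: the helper would be reading the wrong structure.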
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 9:21 AM, Jesse Barnes [EMAIL PROTECTED] wrote:
> > I hope those are just warnings that can be ignored.
>
> Oops again, should be dev->pdev. Silly DRM layer obfuscation.

I was just about to write that the test didn't work. Both std and str hang even before attempting to suspend. Anyway, I'm compiling and rebooting now.

Thanks,
Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Hi. Matthew Garrett wrote: On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote: Matthew Garrett wrote: No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. You can draw silly diagrams about how the freezer supposedly works in LCA slides and spread FUD as much as you like. In the end, though, it's not nearly as hit-and-miss as you say, and replacing the freezer with a kexec based freezer is only going to create as many problems as it removes. I'm really not interested in debating the matter. There are all sorts of potential uses for the freezer, but hibernation isn't one of them. We *need* to get rid of the freezer for suspend to RAM (because a band-aid to ensure atomicity is kind of pointless when the operation you're entering is inherently atomic), and once all the drivers are able to deal with that then it's trivial to get rid of it for hibernation as well. Arguing that the reality of userspace drivers is broken doesn't help here. It's what we have to work with. Re suspend to ram, I agree. No argument there. Re hibernation, I think your assertion that it will be trivial to get rid of it for hibernation is just plain wrong. Perhaps you don't understand the issues as well as you think you do. Re arguing that the reality of userspace drivers is broken doesn't help here: Yeah, I know. But sometimes if you point out broken ideas for long enough, people do actually listen. Or you learn. Or both. Frankly, I don't want to debate the issue either. 
What I really want is just to have a hibernation implementation that works, is flexible, reliable and quick, and one that I don't have to keep maintaining. Unfortunately for me, most people seem to be more concerned with fixing hypothetical problems than with giving users something they can actually use. You're looking at a tiny amount of memory when compared to current systems. It's really not a problem. Please, quantify 'tiny'. In embedded, 5MB can be too much. I've worked on embedded solutions. I'm not pulling problems out of thin air. Then the in-kernel solution has already lost anyway, and I'm desperately unconcerned about out of tree stuff. I know. I'd submit it, or work on breaking it into pieces and submitting them one at a time, but that seems to me to be a waste of time. Nigel
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 8:39 AM, Jesse Barnes [EMAIL PROTECTED] wrote: On Wednesday, February 20, 2008 4:35 pm Jeff Chua wrote: On Thu, Feb 21, 2008 at 5:37 AM, Jesse Barnes [EMAIL PROTECTED] wrote: Ok, can you give this patch a try with the 'platform' method? It should at least tell us what ACPI would like the device to do at suspend time, but it probably won't fix the hang. It says calling pci_set_power_state with 3. Then after all then it still hangs, and then resume with Mr Green.

PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
PM: Shrinking memory... ^H-^Hdone (0 pages freed)
PM: Freed 0 kbytes in 0.20 seconds (0.00 MB/s)
ACPI: Preparing to enter system sleep state S4
Suspending console(s)
sd 0:0:0:0: [sda] Synchronizing SCSI cache
drm_sysfs_suspend
ACPI: PCI interrupt for device :00:02.0 disabled
calling pci_set_power_state with 3
ACPI: PCI interrupt for device :00:1d.7 disabled
ACPI: PCI interrupt for device :00:1d.3 disabled
ACPI: PCI interrupt for device :00:1d.2 disabled
ACPI: PCI interrupt for device :00:1d.1 disabled
ACPI: PCI interrupt for device :00:1d.0 disabled
ACPI: PCI interrupt for device :00:1b.0 disabled
Disabling non-boot CPUs ...
PM: Creating hibernation image:
PM: Need to copy 25136 pages
tick-braodcast: ignoring broadcast for offline CPU #1
PM: Writing back config space on device :00:02.0 at offset 1 (was 97, writing 93)
ACPI: PCI Interrupt :00:1b.0[B] - GSI 17 (level, low) - IRQ 17
PCI: Setting latency timer of device :00:1b.0 to 64
PCI: Setting latency timer of device :00:1c.0 to 64
PCI: Setting latency timer of device :00:1c.1 to 64
...

Thanks, Jeff.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 12:17:06PM +1100, Nigel Cunningham wrote: Hi. Greg KH wrote: On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote: Hi. Matthew Garrett wrote: On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote: - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. Racy with regard to other things besides trying to suspend a machine? If so, what? That depends on what sort of tangled web you want to weave. Lots of them :) We have tanks running Linux using userspace USB drivers for vision control systems (scary, I know...) They seem to have been running successfully for many years now, and I'm interested in making sure those kinds of things keep working. We also have laser welding robots with userspace PCI drivers in car manufacturing plants. And other laser cutting robots slicing wood in patterns moving at a rate of over 3 meters a second. Again, with userspace drivers and Linux. Those users would also love to know of any potential problems you know of for this situation. 
Low-memory situations are one other case that occurs to me quickly, especially (though not only) if your ability to swap were to depend upon a userspace driver and/or filesystem. Sure, swap over a userspace filesystem or driver isn't a sane idea. And neither is swapping over NFS over a PPP connection attached to a USB-to-serial device. Yes, it's possible, and all in the kernel, but not a wise decision. Other than foolish configurations, if you come up with other issues surrounding userspace drivers that could cause problems, please let me know. thanks, greg k-h
Re: Kernel Version specific vendor override possibilities needed - Revert and provide osi=linux or provide a replacement
On Thu, 2008-02-21 at 00:13 -0300, Henrique de Moraes Holschuh wrote: On Wed, 20 Feb 2008, Matthew Garrett wrote: It doesn't punish them. They're the ones who are going to work with us to ensure that Linux works on their hardware, and their needs are going And since when do we have to work exactly like Windows (whatever version) does in THAT case? Also, why would one thing (a proper replacement for OSI(Linux)) make any difference to the other (trying to be bug-for-bug compatible with Microsoft crap)? I agree with Henrique. Since there are, in fact, more Windows cases than Linux cases, for Linux we could just announce that osi=linux is recommended. It would also be possible to put Linux-aware BIOSes on a kind of whitelist in the kernel code ... -- Sérgio M. B.
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
Hi Greg. Greg KH wrote: On Thu, Feb 21, 2008 at 12:17:06PM +1100, Nigel Cunningham wrote: Hi. Greg KH wrote: On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote: Hi. Matthew Garrett wrote: On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote: - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. Racy with regard to other things besides trying to suspend a machine? If so, what? That depends on what sort of tangled web you want to weave. Lots of them :) We have tanks running Linux using userspace USB drivers for vision control systems (scary, I know...) They seem to have been running successfully for many years now, and I'm interested in making sure those kinds of things keep working. We also have laser welding robots with userspace PCI drivers in car manufacturing plants. And other laser cutting robots slicing wood in patterns moving at a rate of over 3 meters a second. Again, with userspace drivers and Linux. Those users would also love to know of any potential problems you know of for this situation. 
Low-memory situations are one other case that occurs to me quickly, especially (though not only) if your ability to swap were to depend upon a userspace driver and/or filesystem. Sure, swap over a userspace filesystem or driver isn't a sane idea. And neither is swapping over NFS over a PPP connection attached to a USB-to-serial device. Yes, it's possible, and all in the kernel, but not a wise decision. Other than foolish configurations, if you come up with other issues surrounding userspace drivers that could cause problems, please let me know. A simple OOM condition isn't an issue? Surely a driver stalling because some of its memory gets swapped out just before it goes to use it would be a problem if it resulted in getting the length of a cut wrong or caused some distorted vision or a late turn : Am I missing something? Maybe these drivers mlock memory to avoid those issues or something like that? Regards, Nigel
Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.
On Thu, Feb 21, 2008 at 05:05:32PM +1100, Nigel Cunningham wrote: Hi Greg. Greg KH wrote: On Thu, Feb 21, 2008 at 12:17:06PM +1100, Nigel Cunningham wrote: Hi. Greg KH wrote: On Thu, Feb 21, 2008 at 11:40:06AM +1100, Nigel Cunningham wrote: Hi. Matthew Garrett wrote: On Thu, Feb 21, 2008 at 09:45:02AM +1100, Nigel Cunningham wrote: - people keep talking about hibernating to an ext3 fs mounted on fuse as a limitation of the freezer. To do that with kexec, you're still going to have to bmap the ext3 fs and pass the block list (in which case we can also do it without kexec) or umount all the ext3/fuse part and remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it? No, with a freezer-based model you can basically *never* suspend to anything related to FUSE or a userspace USB device or anything involving userspace iSCSI initiators or whatever. Sure, there are cases where moving away from the current model doesn't buy you anything, but that doesn't mean that the current model is a good thing. It's not. The freezer is a fundamentally broken concept. Putting drivers and filesystems in userspace is the fundamentally broken concept. Not just when it comes to the freezer. The whole idea is inherently racy. Racy with regard to other things besides trying to suspend a machine? If so, what? That depends on what sort of tangled web you want to weave. Lots of them :) We have tanks running Linux using userspace USB drivers for vision control systems (scary, I know...) They seem to have been running successfully for many years now, and I'm interested in making sure those kinds of things keep working. We also have laser welding robots with userspace PCI drivers in car manufacturing plants. And other laser cutting robots slicing wood in patterns moving at a rate of over 3 meters a second. Again, with userspace drivers and Linux. Those users would also love to know of any potential problems you know of for this situation. 
Low-memory situations are one other case that occurs to me quickly, especially (though not only) if your ability to swap were to depend upon a userspace driver and/or filesystem. Sure, swap over a userspace filesystem or driver isn't a sane idea. And neither is swapping over NFS over a PPP connection attached to a USB-to-serial device. Yes, it's possible, and all in the kernel, but not a wise decision. Other than foolish configurations, if you come up with other issues surrounding userspace drivers that could cause problems, please let me know. A simple OOM condition isn't an issue? Surely a driver stalling because some of its memory gets swapped out just before it goes to use it would be a problem if it resulted in getting the length of a cut wrong or caused some distorted vision or a late turn : Am I missing something? Maybe these drivers mlock memory to avoid those issues or something like that? I think they mlock their memory to prevent this from happening, it's not hard when you control all the applications on the box :) thanks, greg k-h
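For readers unfamiliar with the mlock approach mentioned in this exchange, here is a minimal sketch of pinning a buffer in RAM so it cannot be swapped out just before a driver needs it. It is not taken from any of the systems described in the thread. It assumes Linux, where mlock(2) is reachable through ctypes, and the helper name lock_buffer is purely illustrative.

```python
import ctypes
import mmap

# Assumption: a Linux/POSIX system where dlopen(NULL) exposes mlock(2).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_libc.mlock.restype = ctypes.c_int


def lock_buffer(size=mmap.PAGESIZE):
    """Allocate an anonymous mapping and try to pin it in RAM.

    Returns (buf, locked). locked is False when the kernel refuses,
    for example because RLIMIT_MEMLOCK is too low for the request.
    """
    buf = mmap.mmap(-1, size)                       # page-aligned, zero-filled
    view = (ctypes.c_char * size).from_buffer(buf)  # used only to get the address
    rc = _libc.mlock(ctypes.addressof(view), size)
    del view                                        # release the buffer export
    return buf, rc == 0


if __name__ == "__main__":
    buf, locked = lock_buffer()
    print("locked" if locked else "mlock refused; pages may still be swapped")
```

A real-time userspace driver would more likely be written in C and call mlockall() once at startup to cover all current and future mappings. The sketch only shows the per-buffer form, and it does not address the separate concern of out-of-memory kills.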
Re: [BUILD_FAILURE] 2.6.25-rc2-mm1 - Build Failure at acpi_os
On Saturday 16 February 2008 14:47, Kamalesh Babulal wrote: Hi Andrew, The 2.6.25-rc2-mm1 kernel with randconfig build option, fails to build on x86_64 machine

CC drivers/acpi/osl.o
drivers/acpi/osl.c:60:38: error: empty filename in #include
drivers/acpi/osl.c: In function ‘acpi_os_table_override’:
drivers/acpi/osl.c:399: error: ‘AmlCode’ undeclared (first use in this function)
drivers/acpi/osl.c:399: error: (Each undeclared identifier is reported only once
drivers/acpi/osl.c:399: error: for each function it appears in.)
make[2]: *** [drivers/acpi/osl.o] Error 1
make[1]: *** [drivers/acpi] Error 2
make: *** [drivers] Error 2

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.25-rc2-mm1
# Sun Feb 17 08:07:17 2008
#
CONFIG_ACPI_CUSTOM_DSDT=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=

garbage in, garbage out. If you don't give this build option a file name where AmlCode lives, then the build will be unable to find AmlCode[]. http://www.lesswatts.org/projects/acpi/overridingDSDT.php cheers, -Len
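The failure condition Len describes (the custom DSDT option enabled with no file name) can be caught before building. The following is a rough sketch of such a check over the text of a .config file. It is not part of the kernel build system, and the function name custom_dsdt_problem is hypothetical.

```python
def custom_dsdt_problem(config_text):
    """Return a warning string if CONFIG_ACPI_CUSTOM_DSDT=y has no file, else None."""
    opts = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# CONFIG_FOO is not set" comments
        key, sep, value = line.partition("=")
        if sep:
            # String options are normally quoted in .config; drop the quotes.
            opts[key.strip()] = value.strip().strip('"')
    if (opts.get("CONFIG_ACPI_CUSTOM_DSDT") == "y"
            and not opts.get("CONFIG_ACPI_CUSTOM_DSDT_FILE")):
        return ("CONFIG_ACPI_CUSTOM_DSDT=y but "
                "CONFIG_ACPI_CUSTOM_DSDT_FILE is empty")
    return None


if __name__ == "__main__":
    sample = 'CONFIG_ACPI_CUSTOM_DSDT=y\nCONFIG_ACPI_CUSTOM_DSDT_FILE=""\n'
    print(custom_dsdt_problem(sample))
```

A randconfig run can enable the boolean option while leaving the string option empty, which is the combination reported here, so a check like this could be used to filter such configurations out of automated build tests.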