Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi.

On Thu, 2007-04-19 at 00:22 +0200, Christian Hesse wrote:
> On Thursday 19 April 2007, Ingo Molnar wrote:
> > * Christian Hesse [EMAIL PROTECTED] wrote:
> > > > although probably your suspend2 problem is still not fixed, it's
> > > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > > was it against -rc6 or -rc7?
> > >
> > > You are right again. ;-)
> > >
> > > Linux 2.6.21-rc7
> > > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > > CFS v3 (without any additional patches)
> > >
> > > And it still hangs on suspend.
> >
> > what's the easiest way for me to try suspend2? Apply the patch, reboot
> > into the kernel, then execute what command to suspend? (there's a
> > confusing mishmash of initiators of all the suspend variants. Can i
> > drive this by echoing to /sys/power/state?)
>
> Perhaps you have to install suspend2-userui as well for the output (I'm
> not sure whether it works without). Then you can trigger the suspend by
> echoing to /sys/power/suspend2/do_suspend.
>
> Useful information can be found in the Howto:
>
> http://www.suspend2.net/HOWTO
>
> I dropped some ccs to not abuse Linus and friends.

You can suspend and resume without it.

Regards,

Nigel

signature.asc
Description: This is a digitally signed message part
Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi.

On Wed, 2007-04-18 at 18:56 -0400, Bob Picco wrote:
> Ingo Molnar wrote:	[Wed Apr 18 2007, 06:02:28PM EDT]
> > * Christian Hesse [EMAIL PROTECTED] wrote:
> > > > although probably your suspend2 problem is still not fixed, it's
> > > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > > was it against -rc6 or -rc7?
> > >
> > > You are right again. ;-)
> > >
> > > Linux 2.6.21-rc7
> > > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > > CFS v3 (without any additional patches)
> > >
> > > And it still hangs on suspend.
> >
> > what's the easiest way for me to try suspend2? Apply the patch, reboot
> > into the kernel, then execute what command to suspend? (there's a
> > confusing mishmash of initiators of all the suspend variants. Can i
> > drive this by echoing to /sys/power/state?)
> >
> > 	Ingo
>
> I had hoped to collect more data with CFS V2. It crashes in
> scale_nice_down for s2ram when attempting to disable_nonboot_cpus. So
> part of traceback looks like (typed by hand with obvious omissions):
>
> scale_nice_down
> update_stats_wait_end - not shown in traceback because inlined
> pick_next_task_fair
> migration_call
> task_rq_lock
> notifier_call_chain
> _cpu_down
> disable_nonboot_cpus
> ...
>
> This is standard -rc7 with V2 CFS applied. It could be a completely
> unrelated issue. I'll attempt to debug further tomorrow.

That - and Christian's other reply with the jpg - look to me more like
this is an interaction between CFS and cpu hotplugging than Suspend2
itself. Can you also reproduce this with swsusp?

Regards,

Nigel
Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi.

On Thu, 2007-04-19 at 00:02 +0200, Ingo Molnar wrote:
> * Christian Hesse [EMAIL PROTECTED] wrote:
> > > although probably your suspend2 problem is still not fixed, it's
> > > worth a try nevertheless. Which suspend2 patch did you apply, and
> > > was it against -rc6 or -rc7?
> >
> > You are right again. ;-)
> >
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> >
> > And it still hangs on suspend.
>
> what's the easiest way for me to try suspend2? Apply the patch, reboot
> into the kernel, then execute what command to suspend? (there's a
> confusing mishmash of initiators of all the suspend variants. Can i
> drive this by echoing to /sys/power/state?)

From subsequent emails, I think you already got your answer, but just in
case...

Yes, if you enabled "Replace swsusp by default" and you already had it
set up for getting swsusp to resume. If not, and you're using an
initrd/ramfs, you'll need to modify it to

echo > /sys/power/suspend2/do_resume

after /sys and /proc are mounted but prior to mounting / and so on.

Regards,

Nigel
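For readers who would rather drive these control files from a program than from a shell, the following is a minimal sketch in C of what "echo value > file" amounts to. The helper names write_control_file() and hibernate_now() are hypothetical and are not part of Suspend2 or the kernel; only the path /sys/power/state and the value "disk" come from the thread. Writing to the real file needs root and a kernel with hibernation configured.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `value` followed by a newline to `path`, roughly what
 * `echo value > path` does from a shell.
 * Returns 0 on success or a negative errno value on failure.
 * (Hypothetical helper, for illustration only.)
 */
int write_control_file(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%s\n", value) < 0)
		err = -EIO;
	/* Buffered data is flushed here, so a write error may only show up now. */
	if (fclose(f) != 0 && !err)
		err = -errno;
	return err;
}

/* Hypothetical wrapper for the trigger discussed in the thread. */
int hibernate_now(void)
{
	return write_control_file("/sys/power/state", "disk");
}
```

Passing an empty string as the value writes just a newline, which corresponds to a bare "echo > file" such as the do_resume step in an initrd.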
Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy
Hi Ingo.

On Thu, 2007-04-19 at 09:04 +0200, Ingo Molnar wrote:
> * Nigel Cunningham [EMAIL PROTECTED] wrote:
> > From subsequent emails, I think you already got your answer, but just
> > in case... Yes, if you enabled "Replace swsusp by default" and you
> > already had it set up for getting swsusp to resume. If not, and you're
> > using an initrd/ramfs, you'll need to modify it to
> > echo > /sys/power/suspend2/do_resume after /sys and /proc are mounted
> > but prior to mounting / and so on.
>
> yeah, went with the default suggested by your patch:
>
>   CONFIG_SUSPEND2_REPLACE_SWSUSP=y
>
> and it was pretty easy to set things up. I used
> echo disk > /sys/power/state to trigger it.
>
> In hindsight it was all pretty straightforward and suspend2 worked
> beautifully on an UP and on an SMP system i tried. So in exchange for
> suspend2 folks debugging a bug in CFS here's some suspend2 review
> feedback ;)
>
> Any plans about moving suspend2 to the upstream kernel? It should be
> pretty easy for it to co-exist with the current swsuspend code.

I really would like to get it into Linus' tree but Pavel doesn't want it
(obviously!) and I haven't got together enough of a case yet to convince
Andrew. I have yet another here's-why-I-think-it-should-be-merged email
in the works (poor Andrew!) but there are too many other things on my
plate at the mo.

> The patch has quite some size:
>
>   89 files changed, 16452 insertions(+), 69 deletions(-)
>
> that should obviously be split up into more than a dozen sub-patches,
> and fed to lkml with the small ones first. (unless it already is split
> up?)

Right. A good portion (~2000 lines) of that is documentation.

> i cannot comment on the kernel/power/ bits (they are way too large
> anyway), other than that they look pretty clean visually, but the
> lowlevel arch and generic kernel bits look sane in detail too, sans a
> few mostly trivial cleanliness issues:
>
> +int suspend2_faulted = 0;
> +EXPORT_SYMBOL(suspend2_faulted);
>
> should be done via the pagefault notifier chain mechanism. Also, all
> the exports you added should be EXPORT_SYMBOL_GPL().

I'll look at that, but I'm not sure if it's a good idea - this is for
during the atomic copy restore, when DEBUG_PAGEALLOC is enabled on x86.
Other things might touch memory in ways we don't want. It's only needed
for slab pages that get unmapped but not freed.

As far as the module exports go, I'm not expecting them to get merged. I
like building Suspend2 as modules (it helps speed the development cycle),
and see it as potentially useful for embedded but IMO there are too many
export symbols to make merging that code a possibility. This is why
they're all in one file rather than sprinkled through the files that
define the symbols.

> this:
>
> -	ClearPageReserved(virt_to_page(addr));
> -	init_page_count(virt_to_page(addr));
> +	//ClearPageReserved(virt_to_page(addr));
> +	//init_page_count(virt_to_page(addr));
>
> looks like there's a buglet in there still somewhere?

Yeah. When I was recently debugging, I found that cpu hotplugging is
using something marked __init which is causing the machine to
spontaneously reboot when cpus are replugged if DEBUG_PAGEALLOC is
enabled. Haven't had the time to get back to it, and also need some help
with the approach (what makes the machine reboot in this case instead of
oopsing, and how do I stop it?).

> +	if(PageHighMem(page))
> +		return 0;
>
> coding style.

Oh. The space missing after the if? Ok.

> +	BUG_ON( test_suspend_state(SUSPEND_RUNNING) /* Suspend2, that is */
>
> make this a WARN_ON() or a WARN_ON_ONCE() - that way you have a chance
> to even get feedback from users, instead of a 'uhm, X froze' report.
>
> +#define FREEZER_OFF 0
> +#define FREEZER_USERSPACE_FROZEN 1
> +#define FREEZER_FULLY_ON 2
>
> should be:
>
> +#define FREEZER_OFF                 0
> +#define FREEZER_USERSPACE_FROZEN    1
> +#define FREEZER_FULLY_ON            2
>
> (you want your reviewers to have a pleasant time reading your code :)

Ok.

> +#define NETLINK_SUSPEND2_USERUI 20 /* For suspend2's userui */
>
> IIRC userui was at the center of suspend2 merge flames, right? So you
> might want to layer it on top of a less flashy suspend2-core and thus
> get 90% of your patch upstream?

Ok. I've just separated that into its own file/module, so that will be
straightforward to do.

> +++ linux/mm/vmscan.c
>
> the MM impact looks quite nontrivial. But i suspect this is
> unavoidable, because you zap portions of the pagecache on the way to
> disk, so when it comes back it results in a different pagecache (new
> lru lists, etc.), right?

The modifications do three things. First, we're seeking to keep the LRU
static while we're suspending. I originally sought to achieve that by
avoiding entering the vmscan.c logic (not as drastic as it sounds -
Suspend2 is the only thing running!). I think it was Nick who said he'd
rather see the pages unlinked and kept safe
Re: VMWare Workstation 6 for debugging Linux Kernel (!)
Hi.

On Fri, 2007-04-20 at 14:45 +0300, Avi Kivity wrote:
> Andi Kleen wrote:
> > Xavier Bestel [EMAIL PROTECTED] writes:
> > > On Fri, 2007-04-20 at 00:46 +0200, roland wrote:
> > > > We just quietly added an exciting feature to Workstation 6.0. I
> > > > believe it will make WS6 a great tool for Linux kernel
> > > > development. You can now debug kernel of Linux VM with gdb running
> > > > on the Host without changing anything in the Guest VM. No kdb, no
> > > > recompiling and no need for second machine. All you need is a
> > > > single line in VM's configuration file.
> > >
> > > I think qemu has the exact same feature.
> >
> > It doesn't seem to work for x86-64 there though.
>
> kvm's qemu has a patch that allows qemu to be an x86_64 gdbserver (with
> or without kvm).

I meant that vmware wasn't working, but it is now - I was trying a 64-bit
host and client, and needed to know both the different line in the config
file and the different port number.

Regards,

Nigel
Re: VMWare Workstation 6 for debugging Linux Kernel (!)
Hi.

On Fri, 2007-04-20 at 04:21 -0700, Petr Vandrovec wrote:
> Andi Kleen wrote:
> > Xavier Bestel [EMAIL PROTECTED] writes:
> > > On Fri, 2007-04-20 at 00:46 +0200, roland wrote:
> > > > We just quietly added an exciting feature to Workstation 6.0. I
> > > > believe it will make WS6 a great tool for Linux kernel
> > > > development. You can now debug kernel of Linux VM with gdb running
> > > > on the Host without changing anything in the Guest VM. No kdb, no
> > > > recompiling and no need for second machine. All you need is a
> > > > single line in VM's configuration file.
> > >
> > > I think qemu has the exact same feature.
> >
> > It doesn't seem to work for x86-64 there though.
>
> Hello,
> Do you mean with qemu or with VMware? Yes, we do not support replay
> with 64bit guests, but debug interface should just work. Only gotcha is
> that for 64bit guest you need another option:
>
> debugStub.listen.guest64 = TRUE

Ah. That might help :)

> and then you need to attach gdb to port 8864 (*). Unfortunately it does
> not seem possible to build gdb which would support 16bit/32bit code
> while using 64bit gdb on-wire format, so there are two interfaces. And
> if you single-step switch from 64bit mode to 32bit mode or back, you
> also have to switch gdbs. Yes, it is a bit unintuitive, and
> additionally one gdb silently ignores breakpoints set up by other gdb,
> so you need to keep breakpoints in sync between two gdbs yourself :-(
>
> (*) If you are using gdb which has both 32bit and 64bit support, be
> sure to issue appropriate 'set architecture xxx' before 'target remote
> localhost:88xx' (i386:x86-64 for port 8864, i386 or i8086 for port
> 8832). Otherwise gdb is going to die complaining it could not parse
> remote reply.

That too. Thanks!

Nigel
Reasons to merge suspend2.
Hi all.

I've been working on this email on and off for a while, but since Pavel
raised the issue again, I thought I should make a concerted effort to
finish it...

In this email, I'm going to outline the problems with the current design
(uswsusp and swsusp) and the ways in which Suspend2 overcomes those
limitations, before going on to outline the additional advantages
Suspend2 has for users and address objections previously raised against
merging Suspend2.

A) Problems with the current design.

1) Ordering of operations.

The current [u]swsusp design doesn't do things in discrete, well ordered
stages. Storage for the image is not allocated until after the atomic
copy has been done. This means that the process can fail when we are a
significant portion of the way into suspending, and it means it can fail
when the user will seriously expect it to run to completion.

The solution to this issue is simple: separate preparing to suspend from
actually writing the image. In the preparation step, ensure, so far as
you are able, that there will be sufficient memory and sufficient storage
to complete the process, and don't write anything or do any atomic
copying until after that has been done.

The only valid objection I can think of is that you can't know for
certain prior to doing the atomic copy how much memory will be needed for
allocations by driver suspend methods. That can be addressed by a simple
extension of the driver model, wherein drivers could report how many
pages they will need. (If slab will be needed, the worst case can be
assumed). Rafael's notify patches (recently posted) also help in that
area. Once processes are frozen, all significant memory usage can be
accounted for, because the process doing the suspending will be the only
one allocating memory.

2) Limit on image size.

The current implementation limits the size of an image to an absolute
maximum of half the amount of ram. This is certainly an improvement over
the old days where it sought to free everything it could, but it's still
not good enough.

Current memory freeing code doesn't free the exact amount requested;
often far more than has been requested is freed. This does not only
result in a smaller image. It also means the system is proportionately
less responsive on resume at whatever stage those pages are needed
again.

A full image is certainly not needed by everyone. Those with huge amounts
of memory, very fast storage devices or particular memory usage patterns
may, quite rightly, not want to store the whole lot in an image. This
doesn't mean, however, that those who want or need (from their
perspective) a full image of memory shouldn't be able to have it. It just
adds to the argument for making it tunable (which swsusp has done too).

3) Lack of provision for tuning to individual needs.

Swsusp historically included very little provision whatsoever for the
user to tune their configuration. This has recently begun to change, and
I applaud that. But it needs to go further.

Suspending to disk is not a one-size-fits-all situation. People have
different hardware configurations, with the result being that some people
benefit from compression while others do better without it. Some people
want encryption in a particular configuration while others don't care
about encryption at all. Some people want to limit the image size, others
don't. Sometimes a user might want to reboot instead of powering down
(dual booting). All of this should be doable, without having to hack the
code or recompile the kernel, and should be as simple as possible.
Suspend2, via its /sys/power/suspend2 interface and hibernate-script
porcelain, makes this easy.

4) No support for multiple swap devices / non swap storage.

Until recently, [u]swsusp supported a single swap partition only. Support
for a swap file has been added, but [u]swsusp still supports only one
swap device at a time. For most people, this is adequate, but this
doesn't mean everyone should be forced to fit this mould.

[u]swsusp also lacks support for storage to non-swap. Particularly in
systems that rely on swap for normal activity, this can make [u]swsusp
less reliable. The amount of swap available varies according to workload,
so sometimes the user will be unable to suspend. To address this
raciness/competition against other swap usage, Suspend2 supports writing
to a generic file, either a partition or a file on an ordinary
partition.

B) Further advantages of Suspend2.

1) Improvements over swsusp.

a) Modular design.

Parts of Suspend2 implement support for storing an image in swap or in a
file, using cryptoapi for compression and/or encryption and talking to a
userspace user interface via a netlink socket. Suspend2 works just fine
without CONFIG_SWAP, CONFIG_NET and/or CONFIG_CRYPTOAPI, however, because
it uses a modular design wherein support for these subsystems is
abstracted
Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
Hi.

On Wed, 2007-04-25 at 07:29 +, Pavel Machek wrote:
> Hi!
>
> > > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate
> > > the whole thing. I think they've _all_ caused problems for the true
> > > suspend (suspend-to-ram), and the last thing I want to see is three
> > > or four different suspend-to-disk implementations.
> > >
> > > So unlike Ingo, I don't think "let's just integrate them all
> > > side-by-side and maintain them and look who wins" is really a good
> > > idea.
> > >
> > > How many different magic ioctl's does the thing introduce? Is it
> > > really just *two* entry-points (and how simple are they,
> > > interface-wise), and nothing else?
> >
> > userspace-driven-suspend is already in the kernel, today. So it's not
> > really two versions side by side doing the same thing, but more of:
> >
> >   A B C + D E F G H
> >
> > where ABC is used by the uswsusp code today, and ABCDEFGH is used by
> > suspend2. So any suspend2 merge would largely be about adding DEFGH.
>
> Actually, we have 'D H' in kernel, today. It is called swsusp...
>
> (Encryption, swapFile support and Graphical progress are missing from
> today's kernel.)

Along with a lot of other things (see my "Reasons to merge Suspend2"
email from earlier in the day).

> > My original mail was about the following thing: i tried the suspend2
> > patch (which just makes echo disk > /sys/power/state work as expected,
> > as long as you give the booting up kernel image an idea about where
> > the
>
> ..and it means that 'echo disk ...' should work w/o suspend2 patch,
> too. (Just try it). You'll miss compression part, but that provides
> only small speedup.

Please don't spread misinformation to support your case. LZF compression
(which is what all Suspend2 users use AFAIK) generally doubles the speed
of your cycle.

Nigel
Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
Hi.

On Wed, 2007-04-25 at 10:48 +0200, Xavier Bestel wrote:
> On Wed, 2007-04-25 at 07:23 +, Pavel Machek wrote:
> > > I absolutely detest all suspend-to-disk crap. Quite frankly, I hate
> > > the whole thing. I think they've _all_ caused problems for the true
> > > suspend (suspend-to-ram), and the last thing I want to see is three
> > > or four
> >
> > Well, it is a bit more complex than that. suspend-to-disk is a
> > workaround for 'suspend-to-ram eats too much power' (plus some details
> > like being able to replace battery).
> >
> > suspend-to-ram is a workaround for 'idle machine takes way too much
> > power' (plus some details like don't spin the disk so that machine is
> > safe to carry).
>
> I think it depends on who you ask. I personally think that
> suspend-to-$youchoose is a workaround for the slowness of system
> startup. I never turn off my laptop, I just suspend it.
> (And guess what, it uses APM and suspend is really faster and way more
> reliable than each kernel implementation I could try).

If you tried Suspend2 and had problems with reliability, please send me
logs. I'll do all I can to help. (I have to qualify it a bit, because I'm
not able to fix drivers, but if it's a Suspend2 issue, tell me and I'll
fix it).

Regards,

Nigel
Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
Hi.

On Wed, 2007-04-25 at 11:07 +0200, Xavier Bestel wrote:
> On Wed, 2007-04-25 at 18:50 +1000, Nigel Cunningham wrote:
> > > (And guess what, it uses APM and suspend is really faster and way
> > > more reliable than each kernel implementation I could try).
> >
> > If you tried Suspend2 and had problems with reliability, please send
> > me logs. I'll do all I can to help. (I have to qualify it a bit,
> > because I'm not able to fix drivers, but if it's a Suspend2 issue,
> > tell me and I'll fix it).
>
> Does suspend2 work with APM ? After much trying, I think now the ACPI
> implementation of my laptop (a vintage Compaq Armada 1700) is busted,
> only APM works.

It should do. If you set the powerdown method to 0, it will use
machine_power_off() instead of trying to use acpi, fall back to
machine_halt() if that fails and lastly (should not be needed) a
while(1) cpu_relax() loop.

> AFAIR the problem with suspend2 was that it didn't poweroff some parts
> of the laptop (the led of the wifi pcmcia card was on, and the lcd
> light was on too), but that was last year. Kernel's suspend kind of
> worked but didn't resume (no reaction on button press). As I tried all
> this last year, I may have forgotten some things.

The code to poweroff those parts will be dependent on the drivers
(assuming I'm making the right calls). If it's something where swsusp
works and suspend2 doesn't, it will be because I'm doing something wrong.
If they both don't do the right thing, then it's probably the driver.

> Honestly, I like this laptop when it works flawlessly, so I don't see
> many reasons to try *susp* again. I'll do it when I'm bored, just not
> today.

Okay :) Just let me know if I can help.

Nigel
Re: [PATCH] Use more gcc extensions in the Linux headers
Hi.

On Fri, 2007-03-09 at 23:03 -0500, [EMAIL PROTECTED] wrote:
> On Sat, 10 Mar 2007 09:57:32 +1100, Rusty Russell said:
> > +/* GCC is awesome. */
> >  #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) \
> >  	+ sizeof(typeof(int[1 - 2*!!__builtin_types_compatible_p(typeof(arr), \
> >  		typeof(&arr[0]))]))*0)
>
> -/* GCC is awesome. */
> +/* GCC leaves me speechless. */
>
> A speechless Rusty would be horrible. That said, it would be nice if
> the comment was something more like the normal Rusty pearl of wisdom.

I understand the first part, but have no idea what

	+ sizeof(typeof(int[

does...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
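The question at the end of the message above can be answered with a small sketch. It assumes the second type in the comparison is that of &arr[0], a pointer to the first element, which is what makes the comparison meaningful; it also spells typeof as __typeof__ so that it builds in strict ISO modes of GCC and Clang. The names table and table_len() are hypothetical and only there to exercise the macro.

```c
#include <assert.h>
#include <stddef.h>

/*
 * __builtin_types_compatible_p(A, B) is a GCC builtin that evaluates to 1
 * when types A and B are compatible and to 0 otherwise.
 *
 *   real array:  typeof(arr) is int[7],  typeof(&arr[0]) is int *  ->  0
 *   pointer:     typeof(arr) is int *,   typeof(&arr[0]) is int *  ->  1
 *
 * So the array bound is 1 - 2*0 = 1 for an array (a valid type, int[1])
 * and 1 - 2*1 = -1 for a pointer (int[-1], a compile-time error).
 * The whole sizeof(...) term is multiplied by 0, so it adds nothing to
 * the result; it exists only to break the build on misuse.
 */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) \
	+ sizeof(__typeof__(int[1 - 2 * !!__builtin_types_compatible_p( \
		__typeof__(arr), __typeof__(&(arr)[0]))])) * 0)

static int table[7];

/* Number of elements in `table`, computed at compile time. */
size_t table_len(void)
{
	return ARRAY_SIZE(table);
}

/*
 * Uncommenting the function below should fail to compile, because `p`
 * is a pointer and the macro then asks for the type int[-1]:
 *
 * size_t broken(int *p) { return ARRAY_SIZE(p); }
 */
```

The plain sizeof(arr) / sizeof((arr)[0]) form silently returns a wrong answer when handed a pointer; the extra term turns that mistake into a build failure at no run-time cost.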
Re: NAK new drivers without proper power management?
Hi.

On Sat, 2007-02-10 at 23:20 +0100, Rafael J. Wysocki wrote:
> Hi,
>
> On Saturday, 10 February 2007 20:38, Pavel Machek wrote:
> > Hi!
> >
> > > > I don't think this is already done (feel free to correct me if I'm
> > > > wrong).. Can we start to NAK new drivers that don't have proper
> > > > power management implemented? There really is no excuse for
> > > > writing a new driver and not putting .suspend and .resume methods
> > > > in anymore, is there?
> > >
> > > to a large degree, a device driver that doesn't suspend is better
> > > than no device driver at all, right?
> > >
> > > now.. if you want to make the core warn about it, that's very fair
> >
> > Well, driver that is broken on SMP is arguably better than no driver
> > at all, yet we'd probably avoid merging that. It would be nice to
> > start including suspend in 'must work' list...
>
> What about this:
>
> If the device requires that, implement .suspend and .resume or at least
> define .suspend that will always return -ENOSYS (then people will know
> they have to unload the driver before the suspend). Similarly, if you
> aren't sure whether or not the device requires .suspend and .resume,
> define .suspend that will always return -ENOSYS.

If your device requires power management, and you know it requires power
management, why not just implement power management? Doing -ENOSYS
instead is like saying -ESPAMMEBECAUSEIMLAZY.

Let me put it another way: People keep talking about Linux being ready
for the desktop. To me at least (but I dare say for lots of other people
too), being ready for the desktop means that things just work, without
having to recompile kernels or bug driver authors or wait twelve months.
And it means that doing a bare minimum isn't enough. We keep claiming
that Open Source is better than Proprietary software. If we accept
half-pie jobs of implementing support for anything - driver power
management support or hibernation support or whatever - as 'good enough',
we're undercutting that argument. Linux's power management support
should - as far as we're able - be at least as good as that other
operating system's and preferably way, way better. -ENOSYS is just not
acceptable.

Regards,

Nigel
Re: NAK new drivers without proper power management?
Hi.

On Sun, 2007-02-11 at 22:52 +0100, Willy Tarreau wrote:
> On Sun, Feb 11, 2007 at 12:31:14PM -0600, Robert Hancock wrote:
> > Willy Tarreau wrote:
> > > Nigel, don't take it as a personal offense, but I think it is a very
> > > centric view of Linux usages. Where I work, Linux is used a lot on
> > > servers and appliances. It is used for mail relays, HTTP proxies,
> > > anti-viruses, firewalls, routers, load balancers, UTM, SSH relays,
> > > etc... Nobody would ever want to enable power management on those
> > > machines, let alone suspend which would cause a major havoc, should
> > > the system decide to enter suspend for any reason.
> > >
> > > Many people also have Linux on their notebooks, but as a dual-boot.
> > > You read the word ? dual-boot. It means that they cleanly shutdown
> > > their system every time they don't use it anymore, and they won't
> > > know what OS they'll use next time. I've never heard anyone there
> > > complaining "oh, I'm fed up with this boring boot, I always have to
> > > wait 30 seconds when I need to do something, I wish I could suspend
> > > and resume". It is considered the normal way of using their PCs.
> >
> > I think your experience is rather different than that of Joe Average
> > User who doesn't frequent kernel lists, and also I think you'll find
> > that for a lot of Linux laptop users that don't use suspend, the
> > reason is that it doesn't work reliably, quite often due to driver
> > issues.
>
> I would believe it if I knew people using suspend/resume on the other
> OS. But that's not the case either. Also, it happens that with today's
> RAM sizes, suspend-to-disk then resume can be several times slower than
> a clean fresh boot. When you have 1 GB to write at 20 MB/s, it takes 50
> seconds to shut down, and as much to restart. Compare this to 5-10
> seconds for a shutdown and 30-50 seconds for a cold boot, and it might
> give you another clue why there are people not interested in such a
> feature.

I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv
card that Linux doesn't support well yet), and I know other Suspend2
users doing the same. It's made easier by the fact that Suspend2 lets you
reboot instead of powering down.

As to comparing the speed with the time to boot, your estimates are way
out. Both will of course vary with the harddrive and cpu speeds and
compression qualities of the image, but with Suspend2, I'm seeing speeds
more in the range of 40-100MB/s, and even had a report of 160MB/s a
couple of days ago. The rule of thumb I use is:

Run hdparm -t (or equiv) on the drive you'll be using:

[EMAIL PROTECTED]:~$ sudo hdparm -t /dev/hda

/dev/hda:
 Timing buffered disk reads: 120 MB in 3.02 seconds = 39.70 MB/sec

Then calculate

RAM_IN_MB / 2 / HDPARM_RESULT = seconds to read/write image.

In my case:

1024 / 2 / 39.7 = approx 13 seconds.

The / 2 is because with LZF compression, you normally get about 50%
compression.

I think the main reason some people aren't interested in suspend to disk
is because of myths (if you'll excuse the term) like the one you've put
above. Of course the values you give were more accurate for swsusp and
uswsusp until recently, but Suspend2 has had async I/O and compression
for years, so all I can really do is encourage you to try again.

Of course there's another factor you're not taking into account: With
suspending to disk, you don't have to close and reopen documents or shut
down and restart applications. The time to do that should be factored
into the non-suspend-to-disk time to compare apples with apples.

Regards,

Nigel
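The rule of thumb in the message above can be written as a small function, which makes it easy to compare the two sets of numbers quoted in the thread. This is a minimal sketch: the function name image_io_seconds() and its parameters are hypothetical, and the 0.5 compression ratio is simply the "about 50%" LZF figure from the mail, not a measured value.

```c
#include <assert.h>

/*
 * Rule-of-thumb time, in seconds, to read or write a hibernation image.
 *
 * ram_mb:            amount of RAM to be saved, in MB
 * compression_ratio: compressed size / original size
 *                    (0.5 for the ~50% LZF figure, 1.0 for no compression)
 * disk_mb_per_sec:   sequential disk throughput, e.g. from hdparm -t
 *
 * Returns a negative value if the throughput is not positive.
 */
double image_io_seconds(double ram_mb, double compression_ratio,
			double disk_mb_per_sec)
{
	if (disk_mb_per_sec <= 0.0)
		return -1.0;
	return ram_mb * compression_ratio / disk_mb_per_sec;
}
```

With the hdparm figure from the mail, image_io_seconds(1024, 0.5, 39.7) comes to about 12.9 seconds, while the uncompressed 1 GB at 20 MB/s case, image_io_seconds(1000, 1.0, 20), comes to 50 seconds. The estimate ignores CPU time spent compressing and any pages that are freed rather than saved.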
Re: NAK new drivers without proper power management?
Hi.

On Sun, 2007-02-11 at 00:45 +0100, Tilman Schmidt wrote:
> On 10.02.2007 23:37, Nigel Cunningham wrote:
> > If your device requires power management, and you know it requires
> > power management, why not just implement power management? Doing
> > -ENOSYS instead is like saying -ESPAMMEBECAUSEIMLAZY.
>
> Like it or not, power management is far from trivial, and people
> writing device drivers have limited resources. Calling them lazy does
> not help that in the least. If you try to put pressure on them by
> refusing to merge their work as long as it doesn't provide this or that
> functionality, you *may* end up with a few drivers having that
> functionality which otherwise wouldn't, but you *will* also end up with
> a number of drivers never making it into the kernel because their
> authors just have to give up.

It's not that complex. All we're really talking about is a bit of extra
code to cleanup and configure hardware state; things that the driver
author already knows how to do. S3 might require a bit more
initialisation if firmware needs to be reloaded or more extensive
configuration needs to be done, but if there's firmware to be loaded,
there is a reasonably good probability that we loaded it from Linux to
start with anyway.

> Also, in your argument you neglected a few cases:
>
> - What if my device does not require power management?

Then you use a generic routine that does nothing but return success
(potentially shared with other drivers that are in the same situation).

> - What if I don't know whether my device requires power management?

The questions are straightforward: Is there hardware state that needs to
be configured if you've just booted the computer and nothing else has
touched it? If so, that needs to be done in a resume method. Do you need
to clean up state prior to doing the things in the resume method, or
otherwise do things to quiesce the driver? If so, they will need to be
done in the suspend method. The result will be roughly similar to what
you do for module load/unload, except maybe less complete in some cases.

> - What if I know my device would require power management, but don't
>   know how to implement it?

I've just told you above :) Now you know!

> > Let me put it another way: People keep talking about Linux being
> > ready for the desktop. To me at least (but I dare say for lots of
> > other people too), being ready for the desktop means that things just
> > work, without having to recompile kernels or bug driver authors or
> > wait twelve months.
>
> Exactly.
>
> > And it means that doing a bare minimum isn't enough. We keep claiming
> > that Open Source is better than Proprietary software. If we accept
> > half-pie jobs of implementing support for anything - driver power
> > management support or hibernation support or whatever - as 'good
> > enough', we're undercutting that argument. Linux's power management
> > support should - as far as we're able - be at least as good as that
> > other operating system's and preferably way, way better. -ENOSYS is
> > just not acceptable.
>
> Your argument falls down the moment you consider the alternative: not
> merging the driver means that the device won't work at all. (Given that
> out-of-tree drivers are actively discouraged, to put it mildly.) That's
> arguably farther from desktop readiness than a device not supporting
> power management.

I disagree (but I would, of course!). If we apply your logic
consistently, we should merge the driver as soon as any code is written
for it (anything is better than nothing). I'm simply arguing that a
driver handling suspend and resume should be as much of a requirement as
not causing memory corruption or such like is.

Regards,

Nigel
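The two questions in the message above (what must be reprogrammed after a fresh boot, and what must be cleaned up or saved beforehand) can be illustrated with a userspace toy model. Everything here is hypothetical: struct toy_device, struct toy_pm_ops and the toy_* functions are stand-ins invented for this sketch and are not the kernel's driver model API. The model only shows the shape of the idea: a shared no-op for stateless devices, and a save/restore pair for devices whose configuration is lost while powered down.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a device: one register that loses its value when off. */
struct toy_device {
	uint32_t hw_config;    /* "hardware" register, lost on power-off */
	uint32_t saved_config; /* driver's copy, kept in RAM */
};

/* Toy stand-in for a driver's power management hooks. */
struct toy_pm_ops {
	int (*suspend)(struct toy_device *dev);
	int (*resume)(struct toy_device *dev);
};

/* Shared no-op for devices with no state to save or restore. */
int toy_pm_nop(struct toy_device *dev)
{
	(void)dev;
	return 0;
}

/* Suspend: remember what resume will have to reprogram. */
int toy_suspend(struct toy_device *dev)
{
	dev->saved_config = dev->hw_config;
	return 0;
}

/* Resume: redo the configuration a fresh boot would have needed. */
int toy_resume(struct toy_device *dev)
{
	dev->hw_config = dev->saved_config;
	return 0;
}

/* A stateless device shares the no-op; a stateful one saves and restores. */
const struct toy_pm_ops stateless_ops = { toy_pm_nop, toy_pm_nop };
const struct toy_pm_ops stateful_ops = { toy_suspend, toy_resume };
```

A caller would invoke ops->suspend(dev) before power is removed and ops->resume(dev) afterwards; a real driver would additionally quiesce I/O in its suspend method and may need to reload firmware on resume, as the message notes for S3.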
Re: NAK new drivers without proper power management?
Hi. On Sun, 2007-02-11 at 01:44 +0100, Rafael J. Wysocki wrote: Well, it's probably more acceptable than silently doing nothing and the device failing or locking up the machine on resume, but I couldn't agree more that it's not what we want to be encouraging. Perfect may be the enemy of the good, but works except no power management is hardly what I would call good these days, more like pretty sloppy.. I think there are situations in which it can be justified, like: - The driver is not entirely finished, but we want to merge it early, because of many potential users, - The driver has only a few users who aren't interested in the suspend/resume functionality, How do you determine that? How many users have to want suspend/resume functionality before you say Ok. It has to be done now? - The device is undocumented and we don't know how to make it handle the suspend/resume (we may learn that in the future or not). If we know how to initialise/cleanup, we know a good portion of what is needed for suspend/resume. Sure, for some video chipsets, you need more (you need to know how to reprogram the whole thing after S3), but they're the exception. Yes, there are other cases. But on the whole, we're not talking about esoteric knowledge. Regards, Nigel
Re: NAK new drivers without proper power management?
On Sun, 2007-02-11 at 01:27 +0100, Rafael J. Wysocki wrote: On Sunday, 11 February 2007 00:45, Tilman Schmidt wrote: On 10.02.2007 23:37, Nigel Cunningham wrote: If your device requires power management, and you know it requires power management, why not just implement power management? Doing -ENOSYS instead is like saying -ESPAMMEBECAUSEIMLAZY. Like it or not, power management is far from trivial, and people writing device drivers have limited resources. Calling them lazy does not help that in the least. If you try to put pressure on them by refusing to merge their work as long as it doesn't provide this or that functionality, you *may* end up with a few drivers having that functionality which otherwise wouldn't, but you *will* also end up with a number of drivers never making it into the kernel because their authors just have to give up. Also, in your argument you neglected a few cases: - What if my device does not require power management? - What if I don't know whether my device requires power management? - What if I know my device would require power management, but don't know how to implement it? Plus: - What if I'm planning to implement the power management, but not just right now? Why not right now? Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Sun, 2007-02-11 at 07:46 +0100, Willy Tarreau wrote: Hi Nigel, On Sun, Feb 11, 2007 at 09:37:06AM +1100, Nigel Cunningham wrote: On Sat, 2007-02-10 at 23:20 +0100, Rafael J. Wysocki wrote: (...) What about this: If the device requires that, implement .suspend and .resume or at least define .suspend that will always return -ENOSYS (then people will know they have to unload the driver before the suspend). Similarly, if you aren't sure whether or not the device requires .suspend and .resume, define .suspend that will always return -ENOSYS. If your device requires power management, and you know it requires power management, why not just implement power management? Doing -ENOSYS instead is like saying -ESPAMMEBECAUSEIMLAZY. No, it means Not implemented because I don't want to screw that driver with something I'm not expert in. And it also means Other people will quickly notice it and will know how to fix this if they really need it. Ok, that was a bit rough. Sorry. At the same time though, we were talking about new drivers. If you know enough to implement the rest of the driver, surely you know enough to implement the power management part too. (See my previous comment about the similarities to module load/unload code). Let me put it another way: People keep talking about Linux being ready for the desktop. To me at least (but I dare say for lots of other people too), being ready for the desktop means that things just work, without having to recompile kernels or bug driver authors or wait twelve months. It's *one* usage of Linux. For this usage, you could also suggest to stop supporting UP kernels and always build everything with SMP enabled since more and more often, people will use multi-core systems. It will exempt the users from upgrading their kernels when they replace their CPU. We could also try to chase down all the drivers which do not correctly behave when the CPU switches to a lower frequency. And it means that doing a bare minimum isn't enough. 
We keep claiming that Open Source is better than Proprietary software. If we accept half-pie jobs of implementing support for anything - driver power management support or hibernation support or whatever - as 'good enough', we're undercutting that argument. Linux's power management support should - as far as we're able - be at least as good as that other operating system's and preferably way, way better. -ENOSYS is just not acceptable. Nigel, don't take it as a personal offense, but I think it is a very centric view of Linux usages. Where I work, Linux is used a lot on servers and appliances. It is used for mail relays, HTTP proxies, anti-viruses, firewalls, routers, load balancers, UTM, SSH relays, etc... Nobody would ever want to enable power management on those machines, let alone suspend which would cause a major havoc, should the system decide to enter suspend for any reason. I agree. Many people also have Linux on their notebooks, but as a dual-boot. You read the word ? dual-boot. It means that they cleanly shutdown their system every time they don't use it anymore, and they won't know what OS they'll use next time. Not necessarily. I dual boot our desktop machine, and hibernate both, using grub to select which OS to run. I've never heard anyone there complaining oh, I'm fed up with this boring boot, I always have to wait 30 seconds when I need to do something, I wish I could suspend and resume. It is considered the normal way of using their PCs. So globally, those hundreds of notebooks, workstations and servers will not be customers of the suspend code any time soon. It would be a shame to deprive them of working drivers. You must just accept that a lot of people are not interested in your work. It's the same for all of us here. I know that a lot of people are not interested in 2.4 anymore and I'm perfectly fine with that. I'm not asking 2.6 driver authors to ensure that their driver is easy to backport for instance. Neither am I. 
I'm just asking that new drivers have power management as standard. What I really think would be a clean solution would be sort of a capability. Either the driver *is* suspend/resume-capable, and the system can be suspended. Or it is not, and the system must refuse to suspend. It should not be a problem to proceed like this because drivers which will not support suspend will mainly be those which will not have to. And if a user occasionally complains that one driver does not support it, at least you will have a good argument against its author to implement suspend. Yes, but why should the user have to complain to start with? Regards, Nigel
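The "capability" idea in the message above (and the earlier suggestion that a stub .suspend returning -ENOSYS makes the system refuse to suspend) can be illustrated with a toy model. This is only a sketch in Python, not kernel code: the Driver class, always_enosys() and suspend_all() are hypothetical names, and treating a missing callback as success is an assumption made for the illustration. As noted elsewhere in the thread, the real PCI layer does basic PM itself when a driver supplies no methods.

```python
import errno


class Driver:
    """Toy stand-in for a driver with optional suspend/resume callbacks."""

    def __init__(self, name, suspend=None, resume=None):
        self.name = name
        self.suspend = suspend
        self.resume = resume


def always_enosys():
    """Stub suspend method: explicitly refuse, as proposed in the thread."""
    return -errno.ENOSYS


def suspend_all(drivers):
    """Suspend drivers in order; on the first failure, undo and abort.

    Returns (0, None) on success, or (error_code, driver_name) for the
    driver that refused. A missing callback counts as success here,
    which is an assumption of this model.
    """
    suspended = []
    for drv in drivers:
        rc = drv.suspend() if drv.suspend is not None else 0
        if rc != 0:
            # Resume everything already suspended, in reverse order.
            for done in reversed(suspended):
                if done.resume is not None:
                    done.resume()
            return rc, drv.name
        suspended.append(drv)
    return 0, None
```

In this model a single driver whose suspend method returns -ENOSYS is enough to stop the whole suspend, and the caller learns which driver refused, which is the "user will quickly notice" behaviour argued for above.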
Re: [PATCH] Re: NAK new drivers without proper power management?
Hi. On Sun, 2007-02-11 at 12:13 +, Matthew Garrett wrote: On Sun, Feb 11, 2007 at 07:54:04AM +0100, Willy Tarreau wrote: instead of modifying all drivers to explicitly state that they don't support it, we should start with a test of the NULL pointer for .suspend which should mean exactly the same without modifying the drivers. I find it obvious that a driver which does not provide a suspend function will not support it. And if some drivers (eg /dev/null) can support it anyway, it's better to change *those* drivers to explicitly mark them as compatible. No, that doesn't work. In the absence of suspend/resume methods, the PCI layer will implement basic PM itself. In some cases, this works. In others, it doesn't. There's no way to automatically determine which is which without modifying the drivers. I think we have it backwards there. Power management support for a driver should always start with the driver itself. If there's a generic routine that can be used for the bus, the driver should explicitly set the routine to the generic routine. Regards, Nigel
Re: [PATCH] Re: NAK new drivers without proper power management?
Hi. On Sun, 2007-02-11 at 19:53 +0100, Rafael J. Wysocki wrote: Having drivers explicitly marked as to whether they are safe is a good kernel feature; what to do if they're not is policy. That's true, but I assume that the people who opt for doing that are also willing to take part in the review of the drivers. :-) Absolutely :) Well, I don't think so. Let's estimate the number of drivers that define .resume() right now: $ grep -I -l -r '.resume =' linux-2.6.20/drivers/ | wc 102 102 4169 I think the '.resume =' doesn't help - some have tabs. I ran '\.resume' and got 351. It would be interesting to see how many struct pci_driver etc instances lack resume methods. Regards, Nigel
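The two counts above differ because the pattern '.resume =' requires exactly one space before the equals sign, while many initialisers are aligned with tabs. A whitespace-tolerant count can be sketched as below. This is a rough heuristic in Python, not something used in the thread; has_resume_initializer() and count_files_with_resume() are hypothetical names, and the pattern can still produce false positives (for example a comparison written as ".resume ==").

```python
import os
import re

# ".resume", then any run of spaces or tabs, then "=".
RESUME_INIT = re.compile(r"\.resume\s*=")


def has_resume_initializer(source_text):
    """Return True if the text appears to assign a .resume member."""
    return RESUME_INIT.search(source_text) is not None


def count_files_with_resume(root):
    """Count .c files under root that appear to set a .resume member."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".c"):
                continue
            path = os.path.join(dirpath, name)
            # Some sources are not valid UTF-8; replace undecodable bytes.
            with open(path, "r", errors="replace") as handle:
                if has_resume_initializer(handle.read()):
                    count += 1
    return count
```

Usage would be along the lines of count_files_with_resume("linux-2.6.20/drivers"), assuming an unpacked kernel tree at that path. It counts files rather than driver structures, so it does not answer the question of how many struct pci_driver instances lack a resume method.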
Re: NAK new drivers without proper power management?
Hi. On Sun, 2007-02-11 at 21:02 +, Alan wrote: If the device requires that, implement .suspend and .resume or at least define .suspend that will always return -ENOSYS (then people will know they have to unload the driver before the suspend). Similarly, if you aren't sure whether or not the device requires .suspend and .resume, define .suspend that will always return -ENOSYS. Sounds ok to me. Where should this text go? Documentation/SubmittingDrivers ? And testing/submitting drivers, perhaps with additional text in that to make it clear we want suspend/resume support or good excuses Please verify your driver correctly handles suspend and resume. If it does not your patch submission is likely to be suspended and only resume when the driver correctly handles this feature Maybe make it explicit that testing should be done for both suspend to ram and to disk, and with the following usage scenarios as applicable? - built in; - modular, loaded while suspending but not loaded prior to resume from disk; - modular, loaded while suspending and loaded prior to resume from disk; Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Sun, 2007-02-11 at 23:46 +0100, Willy Tarreau wrote: On Sun, 2007-02-11 at 22:52 +0100, Willy Tarreau wrote: On Sun, Feb 11, 2007 at 12:31:14PM -0600, Robert Hancock wrote: Willy Tarreau wrote: Nigel, don't take it as a personal offense, but I think it is a very centric view of Linux usages. Where I work, Linux is used a lot on servers and appliances. It is used for mail relays, HTTP proxies, anti-viruses, firewalls, routers, load balancers, UTM, SSH relays, etc... Nobody would ever want to enable power management on those machines, let alone suspend which would cause a major havoc, should the system decide to enter suspend for any reason. Many people also have Linux on their notebooks, but as a dual-boot. You read the word ? dual-boot. It means that they cleanly shutdown their system every time they don't use it anymore, and they won't know what OS they'll use next time. I've never heard anyone there complaining oh, I'm fed up with this boring boot, I always have to wait 30 seconds when I need to do something, I wish I could suspend and resume. It is considered the normal way of using their PCs. I think your experience is rather different than that of Joe Average User who doesn't frequent kernel lists, and also I think you'll find that for a lot of Linux laptop users that don't use suspend, the reason is that it doesn't work reliably, quite often due to driver issues. I would believe it if I knew people using suspend/resume on the other OS. But that's not the case either. Also, it happens that with today's RAM sizes, suspend-to-disk then resume can be several times slower than a clean fresh boot. When you have 1 GB to write at 20 MB/s, it takes 50 seconds to shut down, and as much to restart. Compare this to 5-10 seconds for a shutdown and 30-50 seconds for a cold boot, and it might give you another clue why there are people not interested in such a feature. 
I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv card that Linux doesn't support well yet), and I know other Suspend2 users doing the same. It's made easier by the fact that Suspend2 lets you reboot instead of powering down. As to comparing the speed with the time to boot, your estimates are way out. Both will of course vary with the harddrive and cpu speeds and compression qualities of the image, but with Suspend2, I'm seeing speeds more in the range of 40-100MB/s, and even had a report of 160MB/s a couple of days ago. The rule of thumb I use is: Run hdparm -t (or equiv) on the drive you'll be using: [EMAIL PROTECTED]:~$ sudo hdparm -t /dev/hda /dev/hda: Timing buffered disk reads: 120 MB in 3.02 seconds = 39.70 MB/sec Then calculate RAM_IN_MB / 2 / HDPARM_RESULT = seconds to read/write image. In my case: 1024 / 2 / 39.7 = approx 13 seconds. The / 2 is because with LZF compression, you normally get about 50% compression. I think the main reason some people aren't interested in suspend to disk is because of myths (if you'll excuse the term) like the one you've put above. Of course the values you give were more accurate for swsusp and uswsusp until recently, but Suspend2 has had async I/O and compression for years, so all I can really do is encourage you to try again. Well, I agree that you give some good arguments here. Of course there's another factor you're not taking into account: With suspending to disk, you don't have to close and reopen documents or shut down and restart applications. The time to do that should be factored into the non-suspend-to-disk time to compare apples with apples. Hmm sorry, but we don't have the same usages of notebooks. For no reason would I keep documents open, for two reasons : - when I shutdown my notebook, it is to move from one customer to home/company/another customer. There's no related work anyway, the network will have changed and I'll have to switch nearly all of my apps anyway. 
So using suspend just to save one reboot is not worth it (for me) IMHO. The network configuration utilities can help there. In addition, Suspend2 preserves the commandline you used to boot with (/sys/power/suspend2/resume_commandline), so you can use a combination of slightly varying grub entries (I have one for not starting ath0 and one for starting it) and scripts to do different things in different environments. The resume_commandline is writable, so can be cleared after usage if there were anything sensitive there. - I would certainly not keep open documents that are on crypted FS while I travel. Otherwise, it would be a total waste of time to enter my passphrase everytime I need to access them ! Some might argue that it would save me a lot of time, providing me with the ability to type my passphrase only once a month, but that's not
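The rule of thumb quoted in the message above (RAM_IN_MB / 2 / HDPARM_RESULT) can be written as a small calculation. This is a minimal sketch, not code from the thread; estimate_image_seconds() and its parameter names are hypothetical, and the 0.5 default simply encodes the "about 50% compression" figure given for LZF.

```python
def estimate_image_seconds(ram_mb, disk_mb_per_s, compression_ratio=0.5):
    """Estimate seconds to read or write a hibernation image.

    ram_mb            -- amount of RAM to be saved, in MB
    disk_mb_per_s     -- sustained disk throughput, e.g. from hdparm -t
    compression_ratio -- compressed size / original size (1.0 = none)
    """
    if disk_mb_per_s <= 0:
        raise ValueError("disk throughput must be positive")
    return ram_mb * compression_ratio / disk_mb_per_s


# The two cases discussed in the thread:
compressed = estimate_image_seconds(1024, 39.7)        # roughly 12.9 s
uncompressed = estimate_image_seconds(1024, 20, 1.0)   # roughly 51 s
```

The first call uses the hdparm figure from the message; the second reproduces the "1 GB at 20 MB/s takes 50 seconds" estimate by turning compression off. Neither accounts for the CPU time spent compressing or for pages that are not saved at all, so both are upper-level approximations only.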
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 02:57 +0400, Manu Abraham wrote: On 2/12/07, Nigel Cunningham [EMAIL PROTECTED] wrote: Neither am I. I'm just asking that new drivers have power management as standard. What if the hardware doesn't support power management ? You would still want to do the cleanup and configuration that you'd do for module load/unload. Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 00:16 +0100, Rafael J. Wysocki wrote: On Monday, 12 February 2007 00:10, Nigel Cunningham wrote: Hi. On Sun, 2007-02-11 at 21:02 +, Alan wrote: If the device requires that, implement .suspend and .resume or at least define .suspend that will always return -ENOSYS (then people will know they have to unload the driver before the suspend). Similarly, if you aren't sure whether or not the device requires .suspend and .resume, define .suspend that will always return -ENOSYS. Sounds ok to me. Where should this text go? Documentation/SubmittingDrivers ? And testing/submitting drivers, perhaps with additional text in that to make it clear we want suspend/resume support or good excuses Please verify your driver correctly handles suspend and resume. If it does not your patch submission is likely to be suspended and only resume when the driver correctly handles this feature Maybe make it explicit that testing should be done for both suspend to ram and to disk, and with the following usage scenarios as applicable? - built in; - modular, loaded while suspending but not loaded prior to resume from disk; - modular, loaded while suspending and loaded prior to resume from disk; I think we should state the general rule in Documentation/SubmittingDrivers and give more details in Documentation/power/devices.txt Sounds good. Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 00:21 +0100, Pavel Machek wrote: Hi! define .suspend that will always return -ENOSYS (then people will know they have to unload the driver before the suspend). Similarly, if you aren't sure whether or not the device requires .suspend and .resume, define .suspend that will always return -ENOSYS. Sounds ok to me. Where should this text go? Documentation/SubmittingDrivers ? And testing/submitting drivers, perhaps with additional text in that to make it clear we want suspend/resume support or good excuses Please verify your driver correctly handles suspend and resume. If it does not your patch submission is likely to be suspended and only resume when the driver correctly handles this feature Maybe make it explicit that testing should be done for both suspend to ram and to disk, and with the following usage scenarios as applicable? Well, for many people s2ram does not work even today... so requiring them to test it is slightly draconian. - built in; - modular, loaded while suspending but not loaded prior to resume from disk; These two should be equivalent. No. The differences are: Built in: The initcalls will have run, but the driver may or may not actually have been used, depending on whether it's used before we start the resume. It should probably be tested with both having been used and not having been used. Modular, loaded prior to suspending but not prior to resuming: At resume time, will still be in whatever config the bios puts it in. No Linux driver code will have touched it. Modular, loaded prior to suspending and resuming: Should be equivalent to built in (module initcalls will have run), but may vary if there's some difference in code/timing between being a module and built in. (This shouldn't happen, but that's the point to testing). - modular, loaded while suspending and loaded prior to resume from disk; Ok.. but I'm not sure how many people will actually test it _that_ thoroughly. Try to test it is good enough for a first version. 
When suspend is in better shape, we can ask for more. I'd prefer to ask for what should be done from the start. Will we expect people to go back and retest if we change the rules, or do we prefer them to complain You didn't adequately point out the testing I needed to do, and I got all these complaints from my users! Regards, Nigel
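The testing scenarios listed in the message above can be turned into an explicit checklist. The sketch below is one reading of "as applicable", not an agreed test plan: it assumes the two "prior to resume from disk" variants only make sense for suspend to disk, and driver_pm_scenarios() is a hypothetical name.

```python
SLEEP_TARGETS = ("suspend to RAM", "suspend to disk")


def driver_pm_scenarios():
    """Return (sleep target, driver configuration) pairs worth testing."""
    cases = []
    for target in SLEEP_TARGETS:
        cases.append((target, "built in"))
        if target == "suspend to disk":
            # Only a resume from disk goes through a fresh boot, so only
            # here can the module be absent or present before the resume.
            cases.append((target, "modular, loaded while suspending, "
                                  "not loaded prior to resume"))
            cases.append((target, "modular, loaded while suspending, "
                                  "loaded prior to resume"))
        else:
            cases.append((target, "modular, loaded while suspending"))
    return cases


for target, config in driver_pm_scenarios():
    print(f"[ ] {target}: {config}")
```

The message also suggests testing the built-in case both with and without the driver having been used before the resume; that refinement is left out here to keep the list short.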
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 03:25 +0400, Manu Abraham wrote: On 2/12/07, Nigel Cunningham [EMAIL PROTECTED] wrote: Hi. On Mon, 2007-02-12 at 02:57 +0400, Manu Abraham wrote: On 2/12/07, Nigel Cunningham [EMAIL PROTECTED] wrote: Neither am I. I'm just asking that new drivers have power management as standard. What if the hardware doesn't support power management ? You would still want to do the cleanup and configuration that you'd do for module load/unload. By adding dummy functions, wouldn't that just look awkward ? If all you need to do is say 'I don't need to do anything' and we have a shared function that does that, all we're talking about doing is adding to your struct pci_driver (or whatever) .resume = generic_empty_resume; To me at least, that doesn't look awkward, and says cleanly and clearly that you've checked things over and decided you know what's required. Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 00:29 +0100, Rafael J. Wysocki wrote: On Sun, 2007-02-11 at 01:44 +0100, Rafael J. Wysocki wrote: Well, it's probably more acceptable than silently doing nothing and the device failing or locking up the machine on resume, but I couldn't agree more that it's not what we want to be encouraging. Perfect may be the enemy of the good, but works except no power management is hardly what I would call good these days, more like pretty sloppy.. I think there are situations in which it can be justified, like: - The driver is not entirely finished, but we want to merge it early, because of many potential users, - The driver has only a few users who aren't interested in the suspend/resume functionality, How do you determine that? How many users have to want suspend/resume functionality before you say Ok. It has to be done now? That depends on what the driver author tells us. If he says there's only one such device in the world and it needs a Linux driver, but the system in question will never be suspended, that will be fine, I think. There are such cases already and I see no reason why there won't be any more in the future. - The device is undocumented and we don't know how to make it handle the suspend/resume (we may learn that in the future or not). If we know how to initialise/cleanup, we know a good portion of what is needed for suspend/resume. Sure, for some video chipsets, you need more (you need to know how to reprogram the whole thing after S3), but they're the exception. Yes, there are other cases. But on the whole, we're not talking about esoteric knowledge. No, in general this is not _that_ simple. Please browse the archives of bcm43xx-dev, for example. Yeah. The problems of not having documentation + having to reassociate and so on. While I agree that the support for suspend and resume _is_ generally important, I also admit that there are situations in which it doesn't matter and there are many people who won't care a whit for it. 
Ok, but that's the exception, right? Not the rule? So in those cases, an exception is made. Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 00:38 +0100, Willy Tarreau wrote: On Mon, Feb 12, 2007 at 10:18:42AM +1100, Nigel Cunningham wrote: [snip] Hmm sorry, but we don't have the same usages of notebooks. For no reason would I keep documents open, for two reasons : - when I shutdown my notebook, it is to move from one customer to home/company/another customer. There's no related work anyway, the network will have changed and I'll have to switch nearly all of my apps anyway. So using suspend just to save one reboot is not worth it (for me) IMHO. The network configuration utilities can help there. In addition, Suspend2 preserves the commandline you used to boot with (/sys/power/suspend2/resume_commandline), so you can use a combination of slightly varying grub entries (I have one for not starting ath0 and one for starting it) and scripts to do different things in different environments. The resume_commandline is writable, so can be cleared after usage if there were anything sensitive there. OK, I see there are features to make life easier when I decide to use suspend. But it looks like that using suspend is the goal and dealing with the constraints is a lot of work and I'm still far from being convinced that it would provide me advantage. Ok. I don't feel like I have to convince everyone :) - I would certainly not keep open documents that are on crypted FS while I travel. Otherwise, it would be a total waste of time to enter my passphrase everytime I need to access them ! Some might argue that it would save me a lot of time, providing me with the ability to type my passphrase only once a month, but that's not what I'm looking for :-) People are using Suspend2 with encryption today (I'm not sure about uswsusp). Some of them have set things up so you need to use a passphrase or usb key to resume, and the image itself is of course encrypted too. 
Unless I'm mistaken, I have to type the passphrase twice then : - once at suspend - once at resume which is once more per boot than what I'm doing on loop-aes. I'm not sure. I don't use encryption myself, so I don't understand all the fine details. I just know that there are people out there using encryption, loop-aes, dmsetup and all that sort of stuff. I don't have to worry about it because they use an initrd/ramfs to do whatever they need to do to provide access to the device on which the image is found, then echo /dev/whatever_funny_device > /sys/power/suspend2/resume2; echo > /sys/power/suspend2/do_resume. You could also close the document and not the app. Or both and just get the benefit of having the app in page cache post-resume. I'm not much convinced by the advantage of reading 500 MB on disk to have emacs in hot cache :-) Neither am I! Presumably you'd have a lot more than emacs in there though :) You could always switch to vim! (*ducks*) Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 00:41 +0100, Rafael J. Wysocki wrote: I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv card that Linux doesn't support well yet), and I know other Suspend2 users doing the same. It's made easier by the fact that Suspend2 lets you reboot instead of powering down. Well, I don't know why you're saying it's a special capability of suspend2. Even the old swsusp has been able to do this since I can remember. ;-) It does?! I just did cat /sys/power/disk and it only says platform. How do you make swsusp reboot instead of powering down? Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 00:50 +0100, Rafael J. Wysocki wrote: On Monday, 12 February 2007 00:47, Nigel Cunningham wrote: Hi. On Mon, 2007-02-12 at 00:41 +0100, Rafael J. Wysocki wrote: I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv card that Linux doesn't support well yet), and I know other Suspend2 users doing the same. It's made easier by the fact that Suspend2 lets you reboot instead of powering down. Well, I don't know why you're saying it's a special capability of suspend2. Even the old swsusp has been able to do this since I can remember. ;-) It does?! I just did cat /sys/power/disk and it only says platform. How do you make swsusp reboot instead of powering down? echo reboot > /sys/power/disk; echo disk > /sys/power/state Ah. Perhaps you should make it show reboot when you cat it? Regards, Nigel
Re: NAK new drivers without proper power management?
Hi. On Mon, 2007-02-12 at 01:09 +0100, Rafael J. Wysocki wrote: On Monday, 12 February 2007 00:55, Nigel Cunningham wrote: Hi. On Mon, 2007-02-12 at 00:50 +0100, Rafael J. Wysocki wrote: On Monday, 12 February 2007 00:47, Nigel Cunningham wrote: Hi. On Mon, 2007-02-12 at 00:41 +0100, Rafael J. Wysocki wrote: I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv card that Linux doesn't support well yet), and I know other Suspend2 users doing the same. It's made easier by the fact that Suspend2 lets you reboot instead of powering down. Well, I don't know why you're saying it's a special capability of suspend2. Even the old swsusp has been able to do this since I can remember. ;-) It does?! I just did cat /sys/power/disk and it only says platform. How do you make swsusp reboot instead of powering down? echo reboot > /sys/power/disk; echo disk > /sys/power/state Ah. Perhaps you should make it show reboot when you cat it? albercik:~ # echo reboot > /sys/power/disk albercik:~ # cat /sys/power/disk reboot It shows the current value, and platform happens to be the default now. Oh, so the problem is that it shows the current value, not the possibilities. I wrongly assumed it would work like /sys/power/state. That explains it :) Regards, Nigel
Re: NAK new drivers without proper power management?
Howdy! On Mon, 2007-02-12 at 01:10 +0100, Tilman Schmidt wrote: Hi, On 11.02.2007 23:37, Nigel Cunningham wrote: On Sun, 2007-02-11 at 00:45 +0100, Tilman Schmidt wrote: On 10.02.2007 23:37, Nigel Cunningham wrote: If your device requires power management, and you know it requires power management, why not just implement power management? [...] Like it or not, power management is far from trivial, and people writing device drivers have limited resources. [...] It's not that complex. All we're really talking about is a bit of extra code to clean up and configure hardware state; things that the driver author already knows how to do. S3 might require a bit more initialisation if firmware needs to be reloaded or more extensive configuration needs to be done, but if there's firmware to be loaded, there is a reasonably good probability that we loaded it from Linux to start with anyway. You are assuming a perfect world where driver authors have complete knowledge of their devices. In reality, many drivers (including those I have the mixed pleasure of maintaining) are based at least in part on reverse engineering, and managing power states may well fall into the domain of things not yet sufficiently reverse engineered. Nope. I'm assuming that the driver author knows what needs to be done to get the driver out of whatever state the BIOS puts it in to start with, and into an operational state, and that they therefore also know what needs to be done to take it out of the operational state again. I'm admitting that there's also another state - the post suspend-to-ram driver state - that they may not know how to deal with. But for suspend-to-disk, if you know how to get the driver to work in the first place, you know enough to stop it working (.suspend) and start it up again (.resume) for the hibernate case at least. I'm not assuming that you know enough to be able to put the driver into a low state and get it out again. 
This is definitely preferable, and at least possibly essential for suspend to ram, but for some unknown reason I'm quite hibernation focused, and for that, just the above is sufficient. Also, in your argument you neglected a few cases: - What if my device does not require power management? Then you as a generic routine that does nothing but return success (potentially shared with other drivers that are in the same situation). But if I just write an empty routine like that I open myself up to criticism along the lines of writing dummy routines just in order to shut up kernel warnings. BTDT. Well, it might not be completely empty. I think someone already pointed out that there's a minimal workset for the pci bus that pci drivers would want to invoke. But we wouldn't (rightly) accuse you of such things if we decided that the policy was Every driver ought to have a resume routine, even if it's just a minimal I-just-work route. - What if I don't know whether my device requires power management? The questions are straight forward: Is there hardware state that needs to be configured if you've just booted the computer and nothing else has touched it? If so, that needs to be done in a resume method. Do you need to clean up state prior to doing the things in the resume method, or otherwise do things to quiesce the driver? If so, they will need to be done in the suspend method. The result will be roughly similar to what you do for module load/unload, except maybe less complete in some cases. I don't doubt your basic assessment. However it doesn't translate that easily into a real implementation. In my case, I maintain a USB driver, so I have to deal with USB specifics of suspend/resume which happen not to be that well documented. My driver provides an isdn4linux device but isdn4linux knows nothing about suspend/resume so I am on my own on how to reconcile the two. The device itself, though in turn far from trivial, is actually the least of my worries. 
Mmm, so that's a case where we need to prod those who write documentation and bus support first. You're probably closer! :) Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA-performance: Linux vs. FreeBSD
Hi Alan et al.

On Mon, 2007-02-12 at 19:08 +, Alan wrote:
> I'm not sure you'll get 50MB/sec sustained to work although you might
> with a good current drive used for nothing else, a linear stream of data
> (no seeking and file system overhead), and a non PCI controller (PCI
> Express, host chipset bus etc).

That's Suspend2's usage pattern when given a whole partition, so I can
state without reservation you can get maximum throughput under those
circumstances, even with a PCI controller. Swsusp should do about the
same too.

Nigel
Re: NAK new drivers without proper power management?
Hi.

On Mon, 2007-02-12 at 16:57 +0100, Geert Uytterhoeven wrote:
> On Mon, 12 Feb 2007, Pavel Machek wrote:
> > > Can't the upper layer just assume -ENOSYS if .resume/.suspend is NULL?
> > > It's nicer if you don't have to implement dummy functions at all.
> >
> > Unfortunately, drivers currently assume NULL == nothing is needed, so
> > we'd have to do big search and replace...
>
> Which means you also cannot easily keep track of which driver supports
> suspend/resume and which doesn't, as there will always be drivers where a
> missing suspend/resume function is correct.
>
> Wouldn't it be more sensible to have
>
>     .suspend = suspend_nothing_to_do
>
> instead, and reserve NULL for `not yet implemented'?

Agreed.

Regards,

Nigel
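The convention Geert proposes above can be sketched in plain userspace C. This is only an illustration, not kernel code: `struct driver_ops`, `core_suspend()` and `driver_declares_pm()` are hypothetical names invented here, and only `suspend_nothing_to_do` and the use of -ENOSYS come from the thread.

```c
#include <errno.h>
#include <stddef.h>

struct device;                          /* opaque for this sketch */

typedef int (*pm_callback)(struct device *dev);

/* Shared no-op meaning "the author checked: nothing needs doing". */
static int suspend_nothing_to_do(struct device *dev)
{
    (void)dev;
    return 0;
}

/* Hypothetical driver operations table. */
struct driver_ops {
    const char *name;
    pm_callback suspend;
    pm_callback resume;
};

/* Hypothetical core-side dispatch: NULL is read as "not yet implemented". */
static int core_suspend(const struct driver_ops *ops, struct device *dev)
{
    if (ops->suspend == NULL)
        return -ENOSYS;
    return ops->suspend(dev);
}

/* Returns 1 only if the author made an explicit statement for both hooks. */
static int driver_declares_pm(const struct driver_ops *ops)
{
    return ops->suspend != NULL && ops->resume != NULL;
}
```

With this split, a driver that genuinely needs no work stays distinguishable from one nobody has looked at yet, which is the bookkeeping problem Geert raises.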
Re: NAK new drivers without proper power management?
Hi.

On Mon, 2007-02-12 at 22:01 +0100, Rafael J. Wysocki wrote:
> On Monday, 12 February 2007 21:58, Pavel Machek wrote:
> > Hi!
> >
> > > > > If all you need to do is say 'I don't need to do anything' and we
> > > > > have a shared function that does that, all we're talking about
> > > > > doing is adding to your struct pci_device (or whatever)
> > > > >
> > > > >     .resume = generic_empty_resume;
> > > > >
> > > > > To me at least, that doesn't look awkward, and says cleanly and
> > > > > clearly that you've checked things over and decided you know
> > > > > what's required.
> > > >
> > > > Actually, I'd like it to be
> > > >
> > > >     .resume = generic_empty_resume; /* Explain, why your driver needs no resume */
> > >
> > > Okay, but we can't define an empty .resume(), because, for example,
> > > the PCI's generic suspend/resume won't be called.
> >
> > PCI drivers should just do .resume = pci_generic_resume, explicitly.
>
> Well, I generally agree, but I think the idea with the pm_safe flag has
> some advantages. For example, the drivers that do define .suspend() and
> .resume() which don't work correctly could be flagged as not pm_safe
> until the problems are fixed.

Oooh. Now I like that idea. Are you thinking of a document in
Documentation/power that describes why pm_safe is off, or comments in the
code itself?

Regards,

Nigel
Re: NAK new drivers without proper power management?
Hi.

On Mon, 2007-02-12 at 06:19 +0100, Willy Tarreau wrote:
> One less myth as Nigel would say call it ;-)

You know me too well! :
Re: NAK new drivers without proper power management?
Hi.

On Mon, 2007-02-12 at 21:06 +0100, Rafael J. Wysocki wrote:
> On Monday, 12 February 2007 05:08, Nigel Cunningham wrote:
> > Nope. I'm assuming that the driver author knows what needs to be done
> > to get the driver out of whatever state the BIOS puts it in to start
> > with, and into an operational state, and that they therefore also know
> > what needs to be done to take it out of the operational state again.
> > I'm admitting that there's also another state - the post
> > suspend-to-ram driver state - that they may not know how to deal with.
> > But for suspend-to-disk, if you know how to get the driver to work in
> > the first place, you know enough to stop it working (.suspend) and
> > start it up again (.resume) for the hibernate case at least.
>
> We're talking about _both_ the STR and STD. The drivers that have
> problems with the STR cannot be regarded as suspend/resume-safe IMO.

Yeah, I'm not disagreeing at all. I'm just admitting my bias toward the
bit I concentrate on more.

> [...]
> > Mmm, so that's a case where we need to prod those who write
> > documentation and bus support first. You're probably closer! :)
>
> Actually, the lack of documentation is a major problem that we all should
> try to fix in the first place. Unfortunately the code has been recently
> changing quite often, so that's difficult.

Yeah.

Nigel
Re: [RFC][PATCH] PM: Document requirements for basic PM support in drivers
Hi.

On Tue, 2007-02-13 at 00:23 +0100, Rafael J. Wysocki wrote:
> Hi,
>
> Here's my attempt to document the requirements with respect to the basic
> PM support in drivers and the testing of that.
>
> Comments welcome.
>
> Greetings,
> Rafael
>
> ---
>  Documentation/SubmittingDrivers         |  10 ++
>  Documentation/power/drivers-testing.txt | 119
>  2 files changed, 129 insertions(+)
>
> Index: linux-2.6.20-git4/Documentation/SubmittingDrivers
> ===
> --- linux-2.6.20-git4.orig/Documentation/SubmittingDrivers
> +++ linux-2.6.20-git4/Documentation/SubmittingDrivers
> @@ -87,6 +87,16 @@ Clarity: It helps if anyone can see how
>  		driver that intentionally obfuscates how the hardware works
>  		it will go in the bitbucket.
>
> +PM support:	Since Linux is used on many portable and desktop systems, your
> +		driver is likely to be used on such a system and therefore it
> +		should support basic power management by implementing, if
> +		necessary, the .suspend and .resume methods used during the
> +		system-wide suspend and resume transitions. You should verify
> +		that your driver correctly handles the suspend and resume, but
> +		if you are unable to ensure that, please at least define the
> +		.suspend method returning the -ENOSYS (Function not
> +		implemented) error.
> +
>  Control:	In general if there is active maintainance of a driver by
>  		the author then patches will be redirected to them unless
>  		they are totally obvious and without need of checking.
> Index: linux-2.6.20-git4/Documentation/power/drivers-testing.txt
> ===
> --- /dev/null
> +++ linux-2.6.20-git4/Documentation/power/drivers-testing.txt
> @@ -0,0 +1,119 @@
> +Testing suspend and resume support in drivers
> +	(C) 2007 Rafael J. Wysocki [EMAIL PROTECTED]
> +
> +Unfortunately, to effectively test the support for the system-wide suspend and
> +resume transitions in a driver, it is necessary to suspend and resume a fully
> +functional system with this driver loaded. Moreover, that should be done many
> +times, preferably many times in a row, and separately for the suspend to disk
> +(STD) and the suspend to RAM (STR) transitions, because each of these cases
> +involves different ordering of operations and different interactions with the
> +machine's BIOS.
> +
> +Of course, for this purpose the test system has to be known to suspend and
> +resume without the driver being tested. Thus, if possible, you should first
> +resolve all suspend/resume-related problems in the test system before you start
> +testing the new driver.
> +
> +I. Preparing the test system
> +
> +1. To verify that the STD works, you can try to suspend in the reboot mode:
> +
> +# echo reboot > /sys/power/disk
> +# echo disk > /sys/power/state
> +
> +and the system should suspend, reboot, resume and get back to the command prompt
> +where you have started the transition. If that happens, the STD is most likely
> +to work correctly, but you can repeat the test a couple of times in a row for
> +confidence. You should also test the platform and shutdown modes of

I would say "you need to repeat the test at least a couple of times...",
perhaps adding something along the lines of "This is necessary because
some problems only show up on a second attempt at suspending and resuming
a driver. You can think of it as the driver coming back 'dazed and
confused' after the first cycle, and only being properly killed by the
second attempt."

> +suspend:
> +
> +# echo platform > /sys/power/disk
> +# echo disk > /sys/power/state
> +
> +or
> +
> +# echo shutdown > /sys/power/disk
> +# echo disk > /sys/power/state
> +
> +in which cases you will have to press the power button to make the system
> +resume. If that works, you are ready to test the STD with the new driver
> +loaded. Otherwise, you have to identify what is wrong.
> +
> +a) To verify if there are any drivers that cause problems you can run the STD
> +in the test mode:
> +
> +# echo test > /sys/power/disk
> +# echo disk > /sys/power/state
> +
> +in which case the system should freeze tasks, suspend devices, disable nonboot
> +CPUs (if any), wait for 5 seconds, enable nonboot CPUs, resume devices, thaw
> +tasks and return to your command prompt. If that fails, most likely there is
> +a driver that fails to either suspend or resume (in the latter case the system
> +may hang or be unstable after the test, so please take that into consideration).
> +To find this driver, you can carry out a binary search according to the rules:
> +- if the test fails, unload a half of the drivers currently loaded and repeat
> +(that would probably involve rebooting the system, so always note what drivers
> +have been loaded before the test),
> +- if the test succeeds, load a half of the drivers you have unloaded most
Re: [linux-pm] 2.6.21-rc4-mm1: freezing of processes broken
Hi.

On Tue, 2007-03-20 at 19:23 -0600, Eric W. Biederman wrote:
> Rafael J. Wysocki [EMAIL PROTECTED] writes:
> > On Tuesday, 20 March 2007 22:06, Rafael J. Wysocki wrote:
> > > On Tuesday, 20 March 2007 21:58, Jiri Slaby wrote:
> > > > Rafael J. Wysocki wrote:
> > > > > Actually, the problem is 100% reproducible on my system too and I
> > > > > doubt it's caused by the recent freezer patches.
> > > >
> > > > I don't know what exactly do you mean by recent, but 2.6.21-rc3-mm2
> > > > works for me.
> > >
> > > Thanks for the confirmation.
> > >
> > > The patches I was talking about had already been in 2.6.21-rc3-mm2,
> > > so the reason of this failure must be different.
> >
> > Bisection shows that the freezing of processes has been broken by one
> > of the patches:
> >
> > remove-the-likelypid-check-in-copy_process.patch
>
> Grr. Oleg's review of remove-the-likelypid-check-in-copy-process showed
> it to be questionable (and it was just an optimization) so we can get rid
> of that one easily.
>
> Although all it did that was really questionable was add the idle process
> to the global process list and bump a process count when we forked the
> idle process. Not dramatically dangerous things.
>
> > use-task_pgrp-task_session-in-copy_process.patch
>
> As I recall that patch was pretty trivial, and shouldn't have anything to
> do with the freezer. The process freezer doesn't care about pids does it?

Could the freezer code be trying to freeze the idle thread as a result?

Regards,

Nigel
Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems
Hi.

On Wed, 2007-03-21 at 18:40 +0200, Maxim Levitsky wrote:
> Hi,
>
> Starting with 2.6.21-rc1 suspend to ram and disk doesn't work anymore on
> my system. I did a git-bisect and found that those commits break it:
>
> e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c
> ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c
> 259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c
>
> I already reported about it, but now I know the reason why suspend
> breaks.
>
> The problem is that both cpu_up/cpu_down were allowed to sleep until now,
> and it did work because those functions could be called only in process
> context (the one that writes to /sys/devices/system/cpu/cpu*/online, or
> the idle thread that does smp_init()).
>
> But now they are called _after_ all tasks were suspended, so if cpu_down
> tries for example to take a lock that is taken by a different process, it
> can't, since the different process is frozen and can't release the lock.
>
> I tested this and all results are positive:
>
> I disabled the 2nd cpu by hand, and then suspend to ram was successful.
> Suspend to disk went correctly, but it hung on resume, and I know why.
> It hung in the old kernel trying to disable the 2nd cpu that was enabled
> by it.
>
> I was able, using kdb, to confirm that this is true, because it was still
> possible to enter kdb and see that the idle thread (swapper) was active
> and uswsusp was waiting on a mutex inside workqueue_cpu_callback.
>
> The solution for this problem seems to be either a complete audit of the
> code that uses register_cpu_notifier, to ensure that it doesn't sleep.
> Also documentation should be changed to note about it.
>
> Or, it is also possible to revert this change.

Do you know exactly which mutex was being waited on and where it was
taken? If you can say that, it would be much more helpful.

Regards,

Nigel
Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems
Hi.

On Wed, 2007-03-21 at 22:38 +0100, Rafael J. Wysocki wrote:
> > Do you know exactly which mutex was being waited on and where it was
> > taken? If you can say that, it would be much more helpful.

Yeah, me too, but assuming too much sometimes bites me :)

> I think this is the XFS problem with freezable workqueues.
>
> Maxim, please try to apply the appended patch and see if it helps.

Thanks for your subsequent messages, Maxim. Could you confirm for us that
the patch Rafael attached fixes it?

Regards,

Nigel

> ---
> Since freezable workqueues are broken in 2.6.21-rc
> (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
> http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
> it's better to remove them altogether for 2.6.21 and change the only user
> of them (XFS) accordingly.
>
> ---
>  fs/xfs/linux-2.6/xfs_buf.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
> ===
> --- linux-2.6.21-rc4.orig/fs/xfs/linux-2.6/xfs_buf.c
> +++ linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
> @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
>  	if (!xfs_buf_zone)
>  		goto out_free_trace_buf;
>
> -	xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
> +	xfslogd_workqueue = create_workqueue("xfslogd");
>  	if (!xfslogd_workqueue)
>  		goto out_free_buf_zone;
>
> -	xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
> +	xfsdatad_workqueue = create_workqueue("xfsdatad");
>  	if (!xfsdatad_workqueue)
>  		goto out_destroy_xfslogd_workqueue;
Re: [PATCH 1/8] Enhance process freezer interface for usage beyond software suspend
Hi.

On Fri, 2007-04-06 at 16:34 +0200, Rafael J. Wysocki wrote:
> On Monday, 2 April 2007 22:51, Pavel Machek wrote:
> > Hi!
> >
> > > > > +/* Per process freezer specific flags */
> > > > > +#define PF_FE_SUSPEND 0x8000 /* This thread should not be frozen
> > > > > + * for suspend
> > > > > + */
> > > > > +
> > > > > +#define PF_FE_KPROBES 0x0010 /* This thread should not be frozen
> > > > > + * for Kprobes
> > > > > + */
> > > >
> > > > Just put the comment before the define for long comments?
> > >
> > > Agreed.
> >
> > (Actually it would be nice to say
> >
> > /* This thread should not be frozen for suspend, because it is needed
> >    for getting image saved to disk */
> >
> > > > > -#ifdef CONFIG_PM
> > > > > +#if defined(CONFIG_PM) || defined(CONFIG_HOTPLUG_CPU) || \
> > > > > +	defined(CONFIG_KPROBES)
> > > >
> > > > Should we create CONFIG_FREEZER?
> > >
> > > Why do you think so? I think the freezer should be compiled
> > > automatically if any of the above is set, which is what this
> > > directive really means.
> >
> > Kconfig can do that. (select statement). If we have one such ifdef, it
> > is okay, but if it would be more of them.
> >
> > > > Hmmm, I do not really like softlockup watchdog running during
> > > > suspend. Can we make this freezeable and make watchdog shut itself
> > > > off while suspending?
> > >
> > > Generally, I agree, but this patch only replaces the existing
> > > instances of PF_NOFREEZE with the new mechanism. The changes you're
> > > talking about require a separate patch series (or at least one
> > > separate patch), I think, and they need not be so simple to make.
> >
> > Agreed about separate patch series.
> >
> > > > > -	current->flags |= PF_NOFREEZE;
> > > > > +	freezer_exempt(FE_ALL);
> > > > >  	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
> > > > >  	if (pid > 0) {
> > > > >  		while (pid != sys_wait4(-1, NULL, 0, NULL))
> > > >
> > > > Does this mean we have userland /linuxrc running with PF_NOFREEZE?
> > > > That would be very bad...
> > >
> > > No, actually it is _required_ for the userland resume to work. Well,
> > > perhaps I should place a comment in there so that I don't have to
> > > explain this again and again. :-)
> >
> > Can you put big bold comment there?
> >
> > Why is it needed? Freezer never freezes _current_ task.
>
> No, it doesn't, but this task spawns linuxrc and then calls sys_wait4()
> in a loop. Well, actually, I'll try to plant try_to_freeze() in this loop
> and see if that works. If it doesn't, I'll add a comment.

It works. I've had:

	while (pid != sys_wait4(-1, NULL, 0, NULL)) {
		yield();
		try_to_freeze();
	}

there for ages for Suspend2.

Regards,

Nigel
Re: [PATCH 3/8] Use process freezer for cpu-hotplug
Hi.

On Fri, 2007-04-06 at 12:47 -0500, Nathan Lynch wrote:
> Ingo Molnar wrote:
> > * Nathan Lynch [EMAIL PROTECTED] wrote:
> > > > -	raw_notifier_call_chain(&cpu_chain, CPU_LOCK_ACQUIRE, hcpu);
> > > > +	if (freeze_processes(FE_HOTPLUG_CPU)) {
> > > > +		thaw_processes(FE_HOTPLUG_CPU);
> > > > +		return -EBUSY;
> > > > +	}
> > > > +
> > >
> > > If I'm understanding correctly, this will cause
> > >
> > > # echo 0 > /sys/devices/system/cpu/cpuX/online
> > >
> > > to sometimes fail, and userspace is expected to try again? This will
> > > break existing applications.
> > >
> > > Perhaps drivers/base/cpu.c:store_online should retry as long as
> > > cpu_up/down return -EBUSY. That would avoid a userspace-visible
> > > interface change.
> >
> > yeah. I'd even suggest a freeze_processes_nofail() API instead, that
> > does this internally, without burdening the callsites. (and once the
> > freezer becomes complete then freeze_processes_nofail() ==
> > freeze_processes())
>
> Yeah, I just realized that an implementation of my proposal would busy
> loop in the kernel forever if a silly admin tried to offline the last cpu
> (we're already using -EBUSY for that case), so freeze_processes_nofail is
> a better idea :-)

If there's only one online cpu, shouldn't it return -EINVAL?

Regards,

Nigel
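The reason the choice of error code matters for a retry loop can be shown with a small userspace model. This is a hedged sketch, not kernel code: `retry_while_busy()`, `struct toy_cpus` and `toy_offline_one()` are hypothetical names made up for illustration, and the retry cap is an assumption (the thread does not say whether a real implementation would bound the loop).

```c
#include <errno.h>

/* Hypothetical stand-in for an operation that may fail transiently. */
typedef int (*attempt_fn)(void *ctx);

/* Retry only while the operation reports -EBUSY, up to max_tries. */
static int retry_while_busy(attempt_fn attempt, void *ctx, int max_tries)
{
    int ret = -EBUSY;
    int i;

    for (i = 0; i < max_tries && ret == -EBUSY; i++)
        ret = attempt(ctx);
    return ret;
}

/* Toy model of a CPU-offline request, for illustration only. */
struct toy_cpus {
    int online;     /* CPUs currently online */
    int busy_left;  /* how many more attempts will report -EBUSY */
    int attempts;   /* attempts made so far */
};

static int toy_offline_one(void *ctx)
{
    struct toy_cpus *t = ctx;

    t->attempts++;
    if (t->online <= 1)
        return -EINVAL;     /* permanent condition: not worth retrying */
    if (t->busy_left > 0) {
        t->busy_left--;
        return -EBUSY;      /* transient condition: caller may retry */
    }
    t->online--;
    return 0;
}
```

If the last-CPU case reported -EBUSY as well, the loop above could not tell it apart from a transient freezer failure; a distinct code such as -EINVAL lets it give up after one attempt.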
Re: USB: on suspend to ram/disk all usb devices are replugged
Hi.

On Mon, 2007-04-02 at 21:36 +0200, Pavel Machek wrote:
> Hi!
>
> > > But you're still likely to run into trouble if you unplug a storage
> > > device, move it to another system and write on it, then plug it back
> > > into the original system. The PLVM would somehow have to recognize
> > > that the data had been changed. I don't know a foolproof way of doing
> > > that.
> >
> > Mark the filesystem as in-use with a one-time UUID in the superblock at
> > mount time. If one moved the drive to another system it would require
> > an fsck to clear the UUID before the other system could use it; then
> > the original machine would refuse to use the drive when the UUID didn't
> > match on resume.
>
> You still need fs-specific code, I'm afraid... plus userland tool to
> reset signatures back.

You don't need userland to reset the signatures. More kernel code, sure.
But it doesn't _need_ to be userland.

Regards,

Nigel
Re: [PATCH 1/8] Enhance process freezer interface for usage beyond software suspend
Hi.

On Sat, 2007-04-07 at 11:33 +0200, Rafael J. Wysocki wrote:
> On Saturday, 7 April 2007 00:20, Nigel Cunningham wrote:
> > > > > > > -	current->flags |= PF_NOFREEZE;
> > > > > > > +	freezer_exempt(FE_ALL);
> > > > > > >  	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
> > > > > > >  	if (pid > 0) {
> > > > > > >  		while (pid != sys_wait4(-1, NULL, 0, NULL))
> > > > > >
> > > > > > Does this mean we have userland /linuxrc running with
> > > > > > PF_NOFREEZE? That would be very bad...
> > > > >
> > > > > No, actually it is _required_ for the userland resume to work.
> > > > > Well, perhaps I should place a comment in there so that I don't
> > > > > have to explain this again and again. :-)
> > > >
> > > > Can you put big bold comment there?
> > > >
> > > > Why is it needed? Freezer never freezes _current_ task.
> > >
> > > No, it doesn't, but this task spawns linuxrc and then calls
> > > sys_wait4() in a loop. Well, actually, I'll try to plant
> > > try_to_freeze() in this loop and see if that works. If it doesn't,
> > > I'll add a comment.
> >
> > It works. I've had:
> >
> > 	while (pid != sys_wait4(-1, NULL, 0, NULL)) {
> > 		yield();
> > 		try_to_freeze();
> > 	}
> >
> > there for ages for Suspend2.
>
> OK, thanks.
>
> Is there any particular reason to place try_to_freeze() after yield()?

Not that I remember. I haven't touched that for years :)

Nigel
Re: [PATCH -mm] freezer: Remove PF_NOFREEZE from handle_initrd
Hi again.

By the way, I'm stopping using [EMAIL PROTECTED]; could you please change
your address book to nigel at nigel dot suspend2 dot net?

Thanks!

Nigel
Re: [PATCH -mm] freezer: Remove PF_NOFREEZE from handle_initrd
Hi.

On Sat, 2007-04-07 at 18:14 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki [EMAIL PROTECTED]
>
> Make handle_initrd() call try_to_freeze() in a suitable place instead of
> setting PF_NOFREEZE for the current task.
>
> Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
> ---
>  init/do_mounts_initrd.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> Index: linux-2.6.21-rc6/init/do_mounts_initrd.c
> ===
> --- linux-2.6.21-rc6.orig/init/do_mounts_initrd.c
> +++ linux-2.6.21-rc6/init/do_mounts_initrd.c
> @@ -55,11 +55,12 @@ static void __init handle_initrd(void)
>  	sys_mount(".", "/", NULL, MS_MOVE, NULL);
>  	sys_chroot(".");
>
> -	current->flags |= PF_NOFREEZE;
>  	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
>  	if (pid > 0) {
> -		while (pid != sys_wait4(-1, NULL, 0, NULL))
> +		while (pid != sys_wait4(-1, NULL, 0, NULL)) {
> +			try_to_freeze();
>  			yield();
> +		}
>  	}
>
>  	/* move initrd to rootfs' /old */

ACK.

Regards,

Nigel
Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap
Hi.

On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
> On Sat, 7 Apr 2007 23:20:39 +0200 Rafael J. Wysocki [EMAIL PROTECTED] wrote:
> > This should allow us to reduce the memory usage, practically always,
> > and improve performance.
>
> And does it?

It will. I've been using extents for ages, for the same reasons. I don't
put them in an rb_tree because I view it as less than most efficient, but
it will still be a huge step forward from bitmaps in the normal case.

The worst case would be if every second page of swap was in use, so that
you needed one extent per swap page. In that case, it would use more
memory than the bitmap, but far, far more common will be the case where
only one extent is needed for the whole swap partition, because the
algorithm used by the swap allocator minimises fragmentation.

Regards,

Nigel
Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap
Hi.

On Sun, 2007-04-08 at 01:13 +0200, Rafael J. Wysocki wrote:
> On Sunday, 8 April 2007 00:31, Nigel Cunningham wrote:
> > Hi.
> >
> > On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
> > > On Sat, 7 Apr 2007 23:20:39 +0200 Rafael J. Wysocki [EMAIL PROTECTED] wrote:
> > > > This should allow us to reduce the memory usage, practically
> > > > always, and improve performance.
> > >
> > > And does it?
>
> Yes. There are theoretical corner cases in which it may be less
> efficient than the current approach, but in the usual situation it is
> _much_ better.
>
> > It will. I've been using extents for ages, for the same reasons. I
> > don't put them in an rb_tree because I view it as less than most
> > efficient,
>
> Actually, I don't agree with that. In the normal situation (ie. one
> extent is needed) there is no difference as far as the memory usage or
> performance are concerned, but if there are more extents, the rbtree
> should be more efficient.

I don't think it's worth having a big discussion over, but let me give
you the details, which you can then feel free to ignore :)

The rb_node struct adds an unsigned long and two struct rb_node *
pointers. My extents use one struct extent * pointer. The difference is
thus 12/24 bytes per extent (32/64 bits) vs 20/40. In the normal
situation, not worth worrying about, but I'm also using these for
recording the sectors we write to, and thinking about swap files and
multiple swap devices. Nearly double the memory use bites more as you get
more extents.

Insertion cost for rb_node includes keeping the tree balanced. For
extents, I start with the location of the last insertion to minimise the
cost, so insertion time is usually virtually zero (inc max of last extent
or append a new one). If for some reason swap was allocated out of order,
I might need to traverse the whole chain from the start.

Normal usage in both cases is simply iterating through the list, so I
guess the cost would be approximately the same.

Deletion cost would include rebalancing for the rb_nodes.

Code cost is a gain for you - you're leveraging existing code, I'm adding
a bit more. extent.c is 300 lines including code for serialising the
chains in an image header and iterating through a group of chains
(multiple swap devices support).

rb_nodes seem to be the wrong solution to me because we generally don't
care about searching. We care about minimising memory usage and
maximising the speed of iteration, insertion and deletion. I believe I've
managed to do that with a singly linked, sorted list.

That said, we've agreed that we're normally talking about a small number
of extents, so it's probably not worth the bandwidth I've already spent :)

Regards,

Nigel
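A minimal userspace sketch of the kind of structure described above: a sorted, singly linked chain of extents with a "last touched" hint so that in-order additions cost almost nothing. This is not Suspend2's extent.c; `struct extent_chain`, `extent_add()` and the field names are hypothetical, and the use of inclusive ranges and `malloc()` are assumptions made for illustration. On the size argument, two unsigned longs plus one pointer gives the 12/24 bytes quoted, and adding the rb_node fields to the two range values gives 20/40.

```c
#include <stdlib.h>

/* One contiguous run of values (e.g. swap page offsets), inclusive. */
struct extent {
    unsigned long start, end;
    struct extent *next;
};

/* Sorted chain plus a hint pointing at the most recently touched extent. */
struct extent_chain {
    struct extent *first;
    struct extent *hint;
    int count;
};

/* Add one value; returns 0 on success, -1 if allocation fails. */
static int extent_add(struct extent_chain *c, unsigned long v)
{
    struct extent *prev = NULL, *cur = c->first, *n;

    /* Start from the hint when v sorts at or after it. */
    if (c->hint && c->hint->start <= v) {
        prev = c->hint;
        cur = prev->next;
    }
    /* Advance until cur is the first extent starting after v. */
    while (cur && cur->start <= v) {
        prev = cur;
        cur = cur->next;
    }
    if (prev && v <= prev->end)
        return 0;                       /* already covered */
    if (prev && prev->end + 1 == v) {
        prev->end = v;                  /* grow the previous extent */
        if (cur && cur->start == v + 1) {
            prev->end = cur->end;       /* the gap closed: merge */
            prev->next = cur->next;
            free(cur);
            c->count--;
        }
        c->hint = prev;
        return 0;
    }
    if (cur && cur->start == v + 1) {
        cur->start = v;                 /* grow the next extent downwards */
        c->hint = cur;
        return 0;
    }
    n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->start = n->end = v;
    n->next = cur;
    if (prev)
        prev->next = n;
    else
        c->first = n;
    c->count++;
    c->hint = n;
    return 0;
}
```

When values arrive in ascending order the loop body never runs and the call only bumps `end` on the hinted extent; out-of-order values fall back to a walk from the head, which is the linear worst case discussed in the thread.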
Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap
Hi.

On Sun, 2007-04-08 at 18:47 +0200, Rafael J. Wysocki wrote:
> On Sunday, 8 April 2007 01:42, Nigel Cunningham wrote:
> > Hi.
> >
> > On Sun, 2007-04-08 at 01:13 +0200, Rafael J. Wysocki wrote:
> > > On Sunday, 8 April 2007 00:31, Nigel Cunningham wrote:
> > > > Hi.
> > > >
> > > > On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
> > > > > On Sat, 7 Apr 2007 23:20:39 +0200 Rafael J. Wysocki [EMAIL PROTECTED] wrote:
> > > > > > This should allow us to reduce the memory usage, practically
> > > > > > always, and improve performance.
> > > > >
> > > > > And does it?
> > >
> > > Yes. There are theoretical corner cases in which it may be less
> > > efficient than the current approach, but in the usual situation it
> > > is _much_ better.
> > >
> > > > It will. I've been using extents for ages, for the same reasons. I
> > > > don't put them in an rb_tree because I view it as less than most
> > > > efficient,
> > >
> > > Actually, I don't agree with that. In the normal situation (ie. one
> > > extent is needed) there is no difference as far as the memory usage
> > > or performance are concerned, but if there are more extents, the
> > > rbtree should be more efficient.
> >
> > I don't think it's worth having a big discussion over, but let me give
> > you the details, which you can then feel free to ignore :)
> >
> > The rb_node struct adds an unsigned long and two struct rb_node *
> > pointers. My extents use one struct extent * pointer. The difference
> > is thus 12/24 bytes per extent (32/64 bits) vs 20/40.
>
> Well, you use open-coded lists. If you used list.h lists, the numbers
> would be different. :-)

Yes, but I don't need doubly linked lists.

> > In the normal situation, not worth worrying about, but I'm also using
> > these for recording the sectors we write to, and thinking about swap
> > files and multiple swap devices. Nearly double the memory use bites
> > more as you get more extents.
> >
> > Insertion cost for rb_node includes keeping the tree balanced. For
> > extents, I start with the location of the last insertion to minimise
> > the cost, so insertion time is usually virtually zero (inc max of last
> > extent or append a new one).
>
> Isn't the appending one actually linear worst-case?

Worst case would be the swap allocator returning swap pages in reverse
order. As you and I both know, that doesn't happen. I first implemented
this in 2003. If the worst case actually happened, I would have seen the
effect by now :)

> > If for some reason swap was allocated out of order, I might need to
> > traverse the whole chain from the start.
>
> Exactly.
>
> > Normal usage in both cases is simply iterating through the list, so I
> > guess the cost would be approximately the same.
> >
> > Deletion cost would include rebalancing for the rb_nodes.
>
> In swsusp the deletions are needed only if there's an error.

When freeing swap at the end of the cycle?

> > Code cost is a gain for you - you're leveraging existing code, I'm
> > adding a bit more. extent.c is 300 lines including code for
> > serialising the chains in an image header and iterating through a
> > group of chains (multiple swap devices support).
> >
> > rb_nodes seem to be the wrong solution to me because we generally
> > don't care about searching. We care about minimising memory usage and
> > maximising the speed of iteration, insertion and deletion. I believe
> > I've managed to do that with a singly linked, sorted list.
>
> The insertion also uses searching and in fact I don't really care for
> anything else.

Ok :)

Nigel
Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap
Hi. On Mon, 2007-04-09 at 15:03 +0200, Rafael J. Wysocki wrote: On Sunday, 8 April 2007 23:07, Nigel Cunningham wrote: [--snip--] Normal usage in both cases is simply iterating through the list, so I guess the cost would be approximately the same. Deletion cost would include rebalancing for the rb_nodes. In swsusp the deletions are needed only if there's an error. When freeing swap at the end of the cycle? That depends on what you mean by 'the end'. :-) We free swap if the image saving fails only, since it's allocated after we've created the image. After the resume, the state of swap from before the image creation is the current one anyway. Ah, of course. I forgot that temporarily. Nigel
Re: mconf not removed by make mrproper
Hi. On Sun, 2007-04-01 at 23:17 +0200, Sam Ravnborg wrote: On Thu, Feb 01, 2007 at 02:05:49PM +1100, Nigel Cunningham wrote: Hi. The scripts/kconfig/mconf target isn't removed by the make mrproper target. I can see a couple of possibilities, but wasn't sure which you'd prefer, so thought I'd just raise the issue. It's only an issue for me because my patch generation script relies on make mrproper making a properly clean tree. Fixed - thanks. Sam Works fine here; thanks! Acked-by: Nigel Cunningham [EMAIL PROTECTED]
Re: 2.6.21-rc5: swsusp: Not enough free memory
Hi. On Fri, 2007-04-13 at 14:00 +0200, Rafael J. Wysocki wrote: Shrinking memory... Pages needed: 128103 normal, 0 highmem Pages needed: 125226 normal, 0 highmem Pages needed: -5757 normal, 0 highmem Pages needed: -5757 normal, 0 highmem Pages needed: -5757 normal, 0 highmem Pages needed: -5757 Pages needed: 127953 normal, 0 highmem Pages needed: 125076 normal, 0 highmem Pages needed: -6043 normal, 0 highmem Pages needed: -6043 normal, 0 highmem Pages needed: -6043 normal, 0 highmem Pages needed: -6043 done (200 pages freed) Freed 800 kbytes in 0.16 seconds (5.00 MB/s) Suspending console(s) ... CPU1 is down swsusp: critical section: swsusp: Need to copy 131358 pages swsusp: Normal pages needed: 131358 swsusp: Normal pages needed: 131358 + 1024 + 22, available pages: 130607 Well, it looks like someone allocated about 6000 pages after we had freed enough memory for suspending. We have a tunable allowance in Suspend2 for this, because fglrx allocates a lot of pages in its suspend routine if DRI is enabled. I think some other drivers do too, but fglrx is the main one I know. Nigel signature.asc Description: This is a digitally signed message part
Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)
Hi. On Fri, 2007-04-13 at 22:41 +0200, Rafael J. Wysocki wrote: On Friday, 13 April 2007 14:21, Nigel Cunningham wrote: Hi. On Fri, 2007-04-13 at 14:00 +0200, Rafael J. Wysocki wrote: Shrinking memory... Pages needed: 128103 normal, 0 highmem Pages needed: 125226 normal, 0 highmem Pages needed: -5757 normal, 0 highmem Pages needed: -5757 normal, 0 highmem Pages needed: -5757 normal, 0 highmem Pages needed: -5757 Pages needed: 127953 normal, 0 highmem Pages needed: 125076 normal, 0 highmem Pages needed: -6043 normal, 0 highmem Pages needed: -6043 normal, 0 highmem Pages needed: -6043 normal, 0 highmem Pages needed: -6043 done (200 pages freed) Freed 800 kbytes in 0.16 seconds (5.00 MB/s) Suspending console(s) ... CPU1 is down swsusp: critical section: swsusp: Need to copy 131358 pages swsusp: Normal pages needed: 131358 swsusp: Normal pages needed: 131358 + 1024 + 22, available pages: 130607 Well, it looks like someone allocated about 6000 pages after we had freed enough memory for suspending. We have a tunable allowance in Suspend2 for this, because fglrx allocates a lot of pages in its suspend routine if DRI is enabled. I think some other drivers do too, but fglrx is the main one I know. I wasn't aware of that, thanks for the information. I think this means we'll probably need to add a tunable, similar to image_size, that will allow the users to specify how much spare memory they want to reserve for suspending (instead of the constant PAGES_FOR_IO). IMO we can call it 'spare_memory'. Still, this doesn't look like a real solution, because it would require the users affected by this problem to experiment with suspending in order to figure out how much spare memory they will need. IMO to really fix the problem, we should let the drivers that need much memory for suspending allocate it _before_ the memory shrinker is called. For this purpose we can use notifiers that will be called before we start the shrinking of memory. 
Namely, if a driver needs to allocate a substantial amount of memory for suspending, it can register a notifier that will be called before we try to shrink memory. Then, the memory needed by the driver may be allocated in this notifier (of course, in that case it will also have to be called if the shrinking of memory fails, so that the memory allocated by the driver for suspending can be freed) and used in the driver's .suspend() routine. Comments welcome. Yeah. I've thought about it too. It could also be good for that acpi routine that was allocating memory in an atomic context with the wrong flags. Another idea that occurred to me would be to allow drivers to have a routine saying how much memory they will need, which we could call to calculate the allowance we need. Personally, I think the notifier chain is simpler and preferable :) Regards, Nigel
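The notifier proposal above can be modelled in userspace roughly as follows. This is only a sketch of the idea as described in the message: the event names, `pm_register`, `pm_notify`, `hibernate_prepare` and the demo driver are all hypothetical, and `malloc`/`free` stand in for whatever allocation a real driver would do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event names for this sketch. */
enum pm_event { PM_PRE_SHRINK, PM_SHRINK_FAILED, PM_POST_RESUME };

struct pm_notifier {
    int (*call)(struct pm_notifier *nb, enum pm_event ev);
    struct pm_notifier *next;
};

static struct pm_notifier *pm_chain;

static void pm_register(struct pm_notifier *nb)
{
    nb->next = pm_chain;
    pm_chain = nb;
}

/* Call every notifier; remember the first error but keep going so
 * that cleanup events still reach everyone on the chain. */
static int pm_notify(enum pm_event ev)
{
    int err = 0;

    for (struct pm_notifier *nb = pm_chain; nb; nb = nb->next) {
        int r = nb->call(nb, ev);
        if (r && !err)
            err = r;
    }
    return err;
}

/* A driver that needs a large buffer in its suspend routine. */
struct demo_driver {
    struct pm_notifier nb;  /* first member, so the cast below is valid */
    size_t bytes;
    void *buf;
};

static int demo_event(struct pm_notifier *nb, enum pm_event ev)
{
    struct demo_driver *d = (struct demo_driver *)nb;

    switch (ev) {
    case PM_PRE_SHRINK:     /* grab memory before the shrinker runs */
        d->buf = malloc(d->bytes);
        return d->buf ? 0 : -1;
    case PM_SHRINK_FAILED:  /* suspend aborted: give the memory back */
    case PM_POST_RESUME:    /* suspend finished: buffer no longer needed */
        free(d->buf);
        d->buf = NULL;
        return 0;
    }
    return 0;
}

/* Stand-ins for the memory shrinker succeeding or failing. */
static int shrink_ok(void)   { return 0; }
static int shrink_fail(void) { return -1; }

static int hibernate_prepare(int (*shrink)(void))
{
    int err = pm_notify(PM_PRE_SHRINK);

    if (!err)
        err = shrink();
    if (err)
        pm_notify(PM_SHRINK_FAILED);
    return err;
}
```

The failure path is the part the message flags as necessary: if shrinking fails after drivers have pre-allocated, the same chain has to be called again so the memory is not leaked.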
Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)
Hi. On Sat, 2007-04-14 at 00:10 +0200, Pavel Machek wrote: Hi! Well, it looks like someone allocated about 6000 pages after we had freed enough memory for suspending. We have a tunable allowance in Suspend2 for this, because fglrx allocates a lot of pages in its suspend routine if DRI is enabled. I think some other drivers do too, but fglrx is the main one I know. I wasn't aware of that, thanks for the information. I think this means we'll probably need to add a tunable, similar to image_size, that will allow the users to specify how much spare memory they want to reserve for suspending (instead of the constant PAGES_FOR_IO). IMO we can call it 'spare_memory'. Just increase PAGES_FOR_IO. This should not be tunable. If we don't have a means for drivers to pre-allocate or say how much memory they need, it should be tunable. Frankly, I'm startled that you guys haven't heard of this issue before now. I can't believe everyone who has ever wanted to hibernate with DRM enabled has been using Suspend2. Maybe this is one of the sources of complaints that swsusp isn't reliable? IMO to really fix the problem, we should let the drivers that need much memory for suspending allocate it _before_ the memory shrinker is called. For this purpose we can use notifiers that will be called before we start the shrinking of memory. Namely, if a driver needs to allocate substantial amount of memory Yes please. Using that notifier without leaking the memory will be interesting but if someone needs so much memory during suspend, let them eat their own complexity. It doesn't need to be that complex. Add another (optional) function to the driver model to let drivers say how much they want and it becomes trivial. Maybe this idea should be preferred over the notifier chain. 
Regards, Nigel
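The alternative floated in this message, an optional per-driver function that reports how much memory the driver wants, could look something like the sketch below. Nothing here is an existing driver-model hook: `struct demo_device`, `suspend_pages_needed` and `suspend_allowance` are hypothetical names, and the page counts are illustrative (6000 echoes the figure quoted in the thread, 16 is made up).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device record with an optional "how much do you need" hook. */
struct demo_device {
    const char *name;
    unsigned long (*suspend_pages_needed)(void); /* NULL if the driver needs nothing */
};

/* Sum what every driver asks for; the result is the allowance to leave
 * free before the atomic copy. */
static unsigned long suspend_allowance(const struct demo_device *devs, size_t n)
{
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++)
        if (devs[i].suspend_pages_needed)
            total += devs[i].suspend_pages_needed();
    return total;
}

/* Illustrative drivers. */
static unsigned long gfx_pages(void) { return 6000; }
static unsigned long nic_pages(void) { return 16; }
```

Compared with a notifier, the core code keeps control of when memory is actually allocated; drivers only report a number, which is why the message calls the approach trivial.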
Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)
Hi. On Sat, 2007-04-14 at 00:35 +0200, Rafael J. Wysocki wrote: On Saturday, 14 April 2007 00:10, Pavel Machek wrote: Hi! Well, it looks like someone allocated about 6000 pages after we had freed enough memory for suspending. We have a tunable allowance in Suspend2 for this, because fglrx allocates a lot of pages in its suspend routine if DRI is enabled. I think some other drivers do too, but fglrx is the main one I know. I wasn't aware of that, thanks for the information. I think this means we'll probably need to add a tunable, similar to image_size, that will allow the users to specify how much spare memory they want to reserve for suspending (instead of the constant PAGES_FOR_IO). IMO we can call it 'spare_memory'. Just increase PAGES_FOR_IO. This should not be tunable. Well, I'm not sure. First, we don't really know what the value of it should be and this alone is a good enough reason for making it tunable, IMHO. Second, I think different systems may need different PAGES_FOR_IO and taking just the maximum (even if we learn how much that actually is) seems to be wasteful in the vast majority of cases. Finally, I think it may be possible to speed up image saving by increasing PAGES_FOR_IO without playing with the image size and we can let the user try it (think of distro kernels that are compiled for many different users). It does vary according to the amount of video memory used for DRM, if I understand correctly. Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [linux-pm] [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)
Hi. On Sat, 2007-04-14 at 00:38 +0200, Pavel Machek wrote: Hi! Well, it looks like someone allocated about 6000 pages after we had freed enough memory for suspending. We have a tunable allowance in Suspend2 for this, because fglrx allocates a lot of pages in its suspend routine if DRI is enabled. I think some other drivers do too, but fglrx is the main one I know. I wasn't aware of that, thanks for the information. I think this means we'll probably need to add a tunable, similar to image_size, that will allow the users to specify how much spare memory they want to reserve for suspending (instead of the constant PAGES_FOR_IO). IMO we can call it 'spare_memory'. Just increase PAGES_FOR_IO. This should not be tunable. If we don't have a means for drivers to pre-allocate or say how much memory they need, it should be tunable. Frankly, I'm startled that you guys haven't heard of this issue before now. I can't believe everyone who has ever wanted to hibernate with DRM enabled has been using Suspend2. Maybe this is one of the sources of complaints that swsusp isn't reliable? We do not support closed-source drivers, and open-source drivers are well behaved. I didn't say fglrx was the only example. Any system using DRI (not DRM, sorry), would, I think, be expected. I just mention fglrx because I have a Radeon 200M that can only use fglrx for Beryl etc at the mo - it's the one I'm familiar with. IMO to really fix the problem, we should let the drivers that need much memory for suspending allocate it _before_ the memory shrinker is called. For this purpose we can use notifiers that will be called before we start the shrinking of memory. Namely, if a driver needs to allocate substantial amount of memory Yes please. Using that notifier without leaking the memory will be interesting but if someone needs so much memory during suspend, let them eat their own complexity. It doesn't need to be that complex. 
Add another (optional) function to the driver model to let drivers say how much they want and it becomes trivial. Maybe this idea should be preferred over the notifier chain. Actually, it is trivial to preallocate during boot ;-). As the notifier chain can be useful for other stuff, too, I'd go that way. Pavel! Talk sense! You're not seriously suggesting squirreling away 35 megabytes of a user's memory at boot just because they might want to hibernate with DRI enabled later? Yes, 35 megabytes is a realistic amount. Regards, Nigel
Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)
Hi. On Sat, 2007-04-14 at 00:40 +0200, Pavel Machek wrote: Hi! Well, it looks like someone allocated about 6000 pages after we had freed enough memory for suspending. We have a tunable allowance in Suspend2 for this, because fglrx allocates a lot of pages in its suspend routine if DRI is enabled. I think some other drivers do too, but fglrx is the main one I know. I wasn't aware of that, thanks for the information. I think this means we'll probably need to add a tunable, similar to image_size, that will allow the users to specify how much spare memory they want to reserve for suspending (instead of the constant PAGES_FOR_IO). IMO we can call it 'spare_memory'. Just increase PAGES_FOR_IO. This should not be tunable. Well, I'm not sure. First, we don't really know what the value of it should be and this alone is a good enough reason for making it tunable, IMHO. Second, I think different systems may need different PAGES_FOR_IO and taking just the maximum (even if we learn how much that actually is) seems to be wasteful in Well, it is wasteful as in we save slightly smaller image than we could. That's okay with me. No. If the driver can't allocate the memory, your call to device_suspend will fail. This isn't about image size but about success or failure to hibernate. Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)
Hi. On Sat, 2007-04-14 at 00:57 +0200, Rafael J. Wysocki wrote: Well, I'm not sure. First, we don't really know what the value of it should be and this alone is a good enough reason for making it tunable, IMHO. Second, I think different systems may need different PAGES_FOR_IO and taking just the maximum (even if we learn how much that actually is) seems to be wasteful in Well, it is wasteful as in we save slightly smaller image than we could. That's okay with me. No. If the driver can't allocate the memory, your call to device_suspend will fail. This isn't about image size but about success or failure to hibernate. If we take PAGES_FOR_IO to be the maximum over all possible configurations that can hibernate, the majority of systems will just create smaller images than they could have created for smaller PAGES_FOR_IO, but all of them will be able to hibernate. :-) You also use PAGES_FOR_IO in enough_free_mem. Say you set it to the 9000 pages I mentioned before (35M). On a machine with 64 megabytes of memory, you'll never be able to suspend because you'll never satisfy free > nr_pages + PAGES_FOR_IO + meta. I'll freely admit that 64 megabytes is tiny nowadays, but it's not completely unknown. The point is really that you're effectively making swsusp unusable for machines with RAM < (PAGES_FOR_IO * (say) 3). But what do you set PAGES_FOR_IO to? There'll always be someone with $WHIZ_BANG_CONFIG who is pushing to have the value increased, and every increase knocks out more of your lowend users. Regards, Nigel
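To make the arithmetic in this message concrete, here is a small sketch of the free-memory comparison it refers to. The function below is a simplified stand-in with a hypothetical signature, not the kernel's enough_free_mem(), and it assumes 4 KiB pages. The first set of numbers in the usage note comes from the swsusp log quoted earlier in the thread (131358 + 1024 + 22 needed, 130607 available).

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumption: 4 KiB pages */

/* Simplified model of the check discussed above:
 * suspend may proceed only if free > nr_pages + PAGES_FOR_IO + meta. */
static int enough_free_mem(unsigned long free_pages, unsigned long nr_pages,
                           unsigned long pages_for_io, unsigned long meta)
{
    return free_pages > nr_pages + pages_for_io + meta;
}

/* Convert a page count to whole mebibytes (rounds down). */
static unsigned long pages_to_mib(unsigned long pages)
{
    return pages * DEMO_PAGE_SIZE / (1024UL * 1024UL);
}
```

With the logged values, `enough_free_mem(130607, 131358, 1024, 22)` is false, which matches the reported failure. `pages_to_mib(9000)` gives 35, the figure used in the message, and a 64 MB machine has only 16384 pages in total, so a 9000-page reserve leaves very little room for the image itself.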
Re: Problem with freezable workqueues
Hi. On Wed, 2007-02-28 at 01:08 +0100, Rafael J. Wysocki wrote: On Wednesday, 28 February 2007 01:01, Johannes Berg wrote: On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote: Okay, in that case I'd suggest removing create_freezeable_workqueue() and make all workqueues nonfreezable once again for 2.6.21 (as far as I know, only the two XFS workqueues are affected). I think Nigel might object but I forgot what specific trouble XFS was causing him. We suspected that the XFS' worker threads might commit I/O after freeze_processes() has returned, but that hasn't been supported by evidence, as far as I can recall. Also, making them freezable was controversial ... Controversy is no reason to give in! Nevertheless, I think you're right - I believe the XFS guys said they fixed the issue that had caused I/O to be submitted post-freeze. Well, we'll see if it appears again, won't we? Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Resume from S2R fails after dpm_resume()
Hi. On Fri, 2007-03-02 at 07:25 -0700, Tim Gardner wrote: Pavel Machek wrote: Hi! I instrumented 2.6.21-rc1 base/power/resume.c device_resume() with TRACE_RESUME(0) as the last statement in the function. Sure enough it was the last hash value in the RTC after a hard reboot when resume failed: [ 12.028820] hash matches drivers/base/power/resume.c:104 The machine appears to be absolutely wedged after initiating resume by pressing the power button. The disk flashes for a half second or so, then that's it. It is a Dell XPS, BIOS rev A04. I'm using 'echo 1 > /sys/power/pm_trace; echo mem > /sys/power/state' to initiate the S2R sequence. Any suggestions on where to go from here? Did it work ok in 2.6.20? Can you try to get video working/get serial console/something? Pavel Pavel, The last version that worked well was Ubuntu Edgy (2.6.17). It was broken by 2.6.18. I have not started the 'git bisect' process, instead I've been trying to figure out why it doesn't work in 2.6.21-rc2. Using the TRACE_RESUME macro I've drilled down to kernel/printk.c:__call_console_drivers. So far the last trace info that I have is just before the call to con->write(). I'm trying to figure out what driver has registered as the console (intel_agp or agpgart?). Am I banging my head on a known problem? Tim, it's possible that the problem you're seeing is completely different to the one Pavel is looking for. Given that you're down to looking in console write code, I wonder if it's related to the changes to console suspending that were done around that time. I'd suggest either looking in LKML or Linux-PM archives for a commit related to suspending the console, or doing your git bisect. Regards, Nigel
Re: [linux-pm] [PATCH 2/2 -stable] libata: add missing CONFIG_PM in LLDs
Hi. On Fri, 2007-03-02 at 17:46 +0900, Tejun Heo wrote: Add missing #ifdef CONFIG_PM conditionals around all PM related parts in libata LLDs. Signed-off-by: Tejun Heo [EMAIL PROTECTED] --- drivers/ata/ahci.c | 14 ++ drivers/ata/ata_generic.c |4 drivers/ata/ata_piix.c |4 drivers/ata/pata_ali.c |6 ++ drivers/ata/pata_amd.c |6 ++ drivers/ata/pata_atiixp.c |4 drivers/ata/pata_cmd64x.c |6 ++ drivers/ata/pata_cs5520.c |7 +++ drivers/ata/pata_cs5530.c |6 ++ drivers/ata/pata_cs5535.c |4 drivers/ata/pata_cypress.c |4 drivers/ata/pata_efar.c |4 drivers/ata/pata_hpt366.c |7 ++- drivers/ata/pata_hpt3x3.c |6 ++ drivers/ata/pata_it821x.c |6 ++ drivers/ata/pata_jmicron.c |4 drivers/ata/pata_marvell.c |4 drivers/ata/pata_mpiix.c|4 drivers/ata/pata_netcell.c |4 drivers/ata/pata_ns87410.c |4 drivers/ata/pata_oldpiix.c |4 drivers/ata/pata_opti.c |4 drivers/ata/pata_optidma.c |4 drivers/ata/pata_pdc202xx_old.c |4 drivers/ata/pata_radisys.c |4 drivers/ata/pata_rz1000.c |6 ++ drivers/ata/pata_sc1200.c |4 drivers/ata/pata_serverworks.c |6 ++ drivers/ata/pata_sil680.c |4 drivers/ata/pata_sis.c |4 drivers/ata/pata_triflex.c |4 drivers/ata/pata_via.c |6 ++ drivers/ata/sata_sil.c |2 ++ drivers/ata/sata_sil24.c|2 ++ 34 files changed, 165 insertions(+), 1 deletion(-) Index: work1/drivers/ata/ahci.c === --- work1.orig/drivers/ata/ahci.c +++ work1/drivers/ata/ahci.c @@ -225,10 +225,12 @@ static void ahci_thaw(struct ata_port *a static void ahci_error_handler(struct ata_port *ap); static void ahci_vt8251_error_handler(struct ata_port *ap); static void ahci_post_internal_cmd(struct ata_queued_cmd *qc); +#ifdef CONFIG_PM static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg); static int ahci_port_resume(struct ata_port *ap); static int ahci_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg); static int ahci_pci_device_resume(struct pci_dev *pdev); Wouldn't it be simpler to add #else #define ahci_port_suspend(port, message) (NULL) etc (or something similar)? 
Regards, Nigel
Re: [linux-pm] [PATCH 2/2 -stable] libata: add missing CONFIG_PM in LLDs
Hi. On Sat, 2007-03-03 at 12:20 +0900, Tejun Heo wrote: Hello, Nigel. Nigel Cunningham wrote: Index: work1/drivers/ata/ahci.c === --- work1.orig/drivers/ata/ahci.c +++ work1/drivers/ata/ahci.c @@ -225,10 +225,12 @@ static void ahci_thaw(struct ata_port *a static void ahci_error_handler(struct ata_port *ap); static void ahci_vt8251_error_handler(struct ata_port *ap); static void ahci_post_internal_cmd(struct ata_queued_cmd *qc); +#ifdef CONFIG_PM static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg); static int ahci_port_resume(struct ata_port *ap); static int ahci_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg); static int ahci_pci_device_resume(struct pci_dev *pdev); Wouldn't it be simpler to add #else #define ahci_port_suspend(port, message) (NULL) etc (or something similar)? ahci_port_suspend() is used to fill ata_port_ops vector, so it needs to be a function. If you're talking about defining NULL function, yeah, that will remove half of CONFIG_PMs but would require dummy definitions for all functions. I think both are ugly. :-) Yeah, I didn't look really carefully; an empty static function would have been what I'd have written if I'd paid more attention. I'm working on a linker trick. Please take a look at the following thread. http://thread.gmane.org/gmane.linux.ide/16475 Not familiar with fancy things like that, so I'll just pipe down and leave you to it :). Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
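The point about the ops vector can be illustrated with a small standalone sketch. Because the suspend and resume hooks are stored as function pointers, a macro that expands to NULL at call sites does not help; what does work is either guarding every use with #ifdef or providing empty functions when CONFIG_PM is off. The types and names below (`struct port`, `struct port_ops`, `demo_port_suspend`) are hypothetical and are not the libata definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a port and its operations table. */
struct port {
    int suspended;
};

struct port_ops {
    int (*port_suspend)(struct port *p);
    int (*port_resume)(struct port *p);
};

#ifdef CONFIG_PM
/* Real implementations, built only when power management is enabled. */
static int demo_port_suspend(struct port *p) { p->suspended = 1; return 0; }
static int demo_port_resume(struct port *p)  { p->suspended = 0; return 0; }
#else
/* Empty static functions: they keep the table below free of #ifdefs,
 * at the cost of one dummy definition per hook. */
static int demo_port_suspend(struct port *p) { (void)p; return 0; }
static int demo_port_resume(struct port *p)  { (void)p; return 0; }
#endif

static const struct port_ops demo_ops = {
    .port_suspend = demo_port_suspend,
    .port_resume  = demo_port_resume,
};
```

This is the trade-off named in the exchange: the stub approach removes the conditionals around the table entries but requires a dummy for every function, while the patch as posted wraps both the declarations and the table entries in #ifdef CONFIG_PM.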
Re: Problem with freezable workqueues
Hi. On Tue, 2007-03-06 at 21:31 +0100, Rafael J. Wysocki wrote: Hi, On Tuesday, 6 March 2007 01:30, Johannes Berg wrote: On Tue, 2007-02-27 at 22:51 +0100, Rafael J. Wysocki wrote: For 2.6.21-rc1 I've invented the appended workaround (works for me, waiting for Johannes to confirm it works for him too), but I think we need something better for -mm and future kernels. Finally I could get back to this but after reading the thread I figured it might not be necessary to test this. Please let me know ASAP if you want this patch tested as well or it'll take quite a long time (going skiing for a week on Saturday) I think it won't be necessary. For now, we have decided to make the workqueues nonfreezable (the patch for that has already been merged, AFAICT). In any case, I made the two xfs workqueues non-freezable and everything on my quad powermac works again, I also couldn't detect any filesystem corruption. Good, thanks for the confirmation. I wanted to adapt the BUG_ON(block IO not from suspend code) patch from suspend2 but haven't gotten around to it yet. That might be a good idea for other reasons too, but I'd prefer WARN_ON() instead of BUG_ON() when you're at it. ;-) I made it BUG_ON() because if Suspend2 is running any I/O coming from another source besides Suspend2 may be I/O on a page that's been used for the atomic copy, and in that case it would definitely be bad to write it to disk. If swsusp is running, the BUG_ON() won't trigger IIRC. Regards, Nigel
Re: Radeon xpress 200m and radeonfb kinda work
Hi. On Tue, 2007-03-06 at 01:16 +0100, Johan Henriksson wrote: Hi! I have gotten the radeon xpress 200m (the version without dedicated vmem) to work with radeonfb. The attached patch (against linux-2.6.20.1) works for me. Since I don't have any docs for the card I am unsure if the patch is 100% correct. Can someone else with a 200m try it out? (I have tested it by enabling fbcon and radeonfb in the kernel and added video=radeonfb to lilo. This gave me a nice 1280x800 console :) ) /Johan Henriksson Please CC, I'm not on the list. @@ -2329,7 +2332,7 @@ static int __devinit radeonfb_pci_regist /* -2 is special: means ON on mobility chips and do not * change on others */ - radeonfb_pm_init(rinfo, rinfo->is_mobility ? 1 : -1, ignore_devlist, force_sleep); + radeonfb_pm_init(rinfo, -1,ignore_devlist, force_sleep);//rinfo->is_mobility ? 1 : -1); That looks like it might break !200M. Maybe something like rinfo->is_mobility && !rinfo->rs480 (with additional modifications to define an rs480, of course) - or a more generic name indicating why the rs480 is different? Regards, Nigel
Re: [PATCH 0/20] x86_64 Relocatable bzImage support (V4)
Hi. On Wed, 2007-03-07 at 07:07 -0800, Arjan van de Ven wrote: On Wed, 2007-03-07 at 12:27 +0530, Vivek Goyal wrote: Hi, Here is another attempt on x86_64 relocatable bzImage patches(V4). This patchset makes a bzImage relocatable and same kernel binary can be loaded and run from different physical addresses. have these patches been extensively tested with various suspend scenarios? (S1,S3,S4 in acpi speak or s2ram and s2disk in Linux speak) We did work on this for RHEL5, getting relocatable kernel support working fine with S4. While doing it and since, I've been running Suspend2 with the same patch. Since that work, Vivek has done more modifications, but I can confirm that the basic design is reliable with S4. Haven't tried S3, but can do. Will report back shortly. Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/20] x86_64 Relocatable bzImage support (V4)
Hi. On Thu, 2007-03-08 at 07:49 +1100, Nigel Cunningham wrote: Hi. On Wed, 2007-03-07 at 07:07 -0800, Arjan van de Ven wrote: On Wed, 2007-03-07 at 12:27 +0530, Vivek Goyal wrote: Hi, Here is another attempt on x86_64 relocatable bzImage patches(V4). This patchset makes a bzImage relocatable and same kernel binary can be loaded and run from different physical addresses. have these patches been extensively tested with various suspend scenarios? (S1,S3,S4 in acpi speak or s2ram and s2disk in Linux speak) We did work on this for RHEL5, getting relocatable kernel support working fine with S4. While doing it and since, I've been running Suspend2 with the same patch. Since that work, Vivek has done more modifications, but I can confirm that the basic design is reliable with S4. Haven't tried S3, but can do. Will report back shortly. S3 works okay here with a relocatable x86_64 kernel (2.6.20). Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 16/20] swsusp: do not use virt_to_page on kernel data address
Hi. On Wed, 2007-03-07 at 23:50 +0100, Pavel Machek wrote: Hi! o virt_to_page() call should be used on kernel linear addresses and not on kernel text and data addresses. Swsusp code uses it on kernel data (statically allocated swsusp_header). o Allocate swsusp_header dynamically so that virt_to_page() can be used safely. o I am changing this because in next few patches, __pa() on x86_64 will no longer support kernel text and data addresses and hibernation breaks. Signed-off-by: Vivek Goyal [EMAIL PROTECTED] (I assume this was tested, too?) Absolutely. Nigel
Re: [PATCH 0/20] x86_64 Relocatable bzImage support (V4)
Hi. On Thu, 2007-03-08 at 10:10 +0530, Vivek Goyal wrote: On Thu, Mar 08, 2007 at 10:15:02AM +1100, Nigel Cunningham wrote: Hi. On Thu, 2007-03-08 at 07:49 +1100, Nigel Cunningham wrote: Hi. On Wed, 2007-03-07 at 07:07 -0800, Arjan van de Ven wrote: On Wed, 2007-03-07 at 12:27 +0530, Vivek Goyal wrote: Hi, Here is another attempt on x86_64 relocatable bzImage patches(V4). This patchset makes a bzImage relocatable and same kernel binary can be loaded and run from different physical addresses. have these patches been extensively tested with various suspend scenarios? (S1,S3,S4 in acpi speak or s2ram and s2disk in Linux speak) We did work on this for RHEL5, getting relocatable kernel support working fine with S4. While doing it and since, I've been running Suspend2 with the same patch. Since that work, Vivek has done more modifications, but I can confirm that the basic design is reliable with S4. Haven't tried S3, but can do. Will report back shortly. S3 works okay here with a relocatable x86_64 kernel (2.6.20). Hi Nigel, Is it possible to test S3 with 2.6.21-rc2 kernels also. Right now I don't have access to any machine supporting S3. I tested it at the time of my last posting and it had worked well. Appreciate your help. Tested with rc3 (rc2 wouldn't compile), and it works fine. If you're willing, please add Signed-off-by: Nigel Cunningham [EMAIL PROTECTED] or Acked-by: Nigel Cunningham [EMAIL PROTECTED] to the hibernation related parts as you see appropriate, since I helped (albeit in a minor way compared to your work and Eric's work) with preparing and testing them for RHEL5 and have confirmed they're still ok in this version. Regards, Nigel - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 7/9] unprivileged mounts: allow unprivileged fuse mounts
Hi. Miklos Szeredi wrote: On Tue 2008-01-08 12:35:09, Miklos Szeredi wrote: From: Miklos Szeredi [EMAIL PROTECTED] Use FS_SAFE for fuse fs type, but not for fuseblk. FUSE was designed from the beginning to be safe for unprivileged users. This has also been verified in practice over many years. In addition unprivileged Eh? So 'kill -9 no longer works' and 'suspend no longer works' is not considered important enough to even mention? No. Because in practice they don't seem to matter. Also because there's no way in which fuse could be done differently to address these issues. Could you clarify, please? I hope I'm getting the wrong end of the stick - it sounds to me like you and Pavel are saying that this patch breaks suspending to ram (and hibernating?) but you want to push it anyway because you haven't been able to produce an instance, don't think suspending or hibernating matter and couldn't fix fuse anyway? The 'kill -9' thing is basically due to VFS level locking not being interruptible. It could be changed, but I'm not sure it's worth it. For the suspend issue, there are also no easy solutions. What are the non-easy solutions? Regards, Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 7/9] unprivileged mounts: allow unprivileged fuse mounts
Hi. Miklos Szeredi wrote: On Tue 2008-01-08 12:35:09, Miklos Szeredi wrote: From: Miklos Szeredi [EMAIL PROTECTED] Use FS_SAFE for fuse fs type, but not for fuseblk. FUSE was designed from the beginning to be safe for unprivileged users. This has also been verified in practice over many years. In addition unprivileged Eh? So 'kill -9 no longer works' and 'suspend no longer works' is not considered important enough to even mention? No. Because in practice they don't seem to matter. Also because there's no way in which fuse could be done differently to address these issues. Could you clarify, please? I hope I'm getting the wrong end of the stick - it sounds to me like you and Pavel are saying that this patch breaks suspending to ram (and hibernating?) but you want to push it anyway because you haven't been able to produce an instance, don't think suspending or hibernating matter and couldn't fix fuse anyway? This patch has nothing to do with suspend or hibernate. What this patchset does, is help get rid of fusermount, a suid-root mount helper. It also opens up new possibilities, which are not fuse related. That's what I thought. So what was Pavel talking about with kill -9 no longer works and suspend no longer works above? I couldn't understand it from the context. Fuse has bad interactions with the freezer, theoretically. In practice, I remember just one bug report (that sparked off this whole do we need freezer, or don't we flamefest), that actually got fixed fairly quickly, ...maybe. Rafael probably remembers better. I think they just gave up and considered it unsolvable. I'm not sure it is. The 'kill -9' thing is basically due to VFS level locking not being interruptible. It could be changed, but I'm not sure it's worth it. For the suspend issue, there are also no easy solutions. What are the non-easy solutions? The ability to freeze tasks in uninterruptible sleep, or more generally at any preempt point (except when drivers are poking hardware). 
Couldn't some sort of scheduler based solution deal with the uninterruptible sleeping case? I know this doesn't play well with userspace hibernate, and I don't think it can be resolved without going the kexec way. I can see the desirability of kexec when it comes to avoiding the freezer, but comes with its own problems too - having the original context usable is handy, not having to set aside a large amount of space for a second kernel is also desirable and there are still greater issues of transferring information backwards and forwards between the two kernels. Regards, Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3 -mm] kexec jump -v8
Hi.

Huang, Ying wrote:
> This patchset provides an enhancement to kexec/kdump. It implements
> the following features:
>
> - Backup/restore memory used both by the original kernel and the
>   kexeced kernel.

Why the kexeced kernel as well?

[...]

> The features of this patchset can be used as follows:
>
> - Kernel/system debug through making system snapshot. You can make
>   system snapshot, jump back, do some thing and make another system
>   snapshot.

Are you somehow recording all the filesystem changes after the first snapshot? If not, this is pointless (you'll end up with filesystem corruption).

[...]

> - Cooperative multi-kernel/system. With kexec jump, you can switch
>   between several kernels/systems quickly without boot process except
>   the first time. This appears like swap a whole kernel/system out/in.

How is this useful to the end user?

Regards,

Nigel
Re: [RFT] Port 0x80 I/O speed
Rene Herman wrote: Good day. Would some people on x86 (both 32 and 64) be kind enough to compile and run the attached program? This is about testing how long I/O port access to port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in reporting. Posted a previous incarnation of this before, buried in the outb 0x80 thread which had a serialising problem. This one should as far as I can see measure the right thing though. Please yell if you disagree... For me, on a Duron 1300 (AMD756 chipset) I have a constant: [EMAIL PROTECTED]:~/src/port80$ su -c ./port80 cycles: out 2400, in 2400 and on a PII 400 (Intel 440BX chipset) a constant: [EMAIL PROTECTED]:~/src/port80$ su -c ./port80 cycles: out 553, in 251 Results are (mostly) independent of compiler optimisation, but testing with an -O2 compile should be most useful. Thanks! (AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo) [EMAIL PROTECTED]:~/Downloads$ gcc port80.c -o port80 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1235, in 1207 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1238, in 1205 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1237, in 1209 [EMAIL PROTECTED]:~/Downloads$ gcc -O2 port80.c -o port80 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1844674407370794, in 1844674407369408 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1844674407370795, in 1844674407369404 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1844674407370795, in 1844674407369409 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80 cycles: out 1844674407370798, in 1844674407369407 [EMAIL PROTECTED]:~/Downloads$ cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 36 model name : AMD Turion(tm) 64 Mobile Technology ML-34 stepping: 2 cpu MHz : 800.000 cache size : 1024 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 
syscall nx mmxext fxsr_opt lm 3dnowext 3dnow rep_good pni lahf_lm bogomips: 1592.87 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc Regards, Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
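The -O2 figures above are very close to 2^64 / 10000 (about 1.8446744 x 10^15). That is the kind of value an unsigned 64-bit subtraction produces when the end reading comes out smaller than the start reading and the wrapped difference is then divided by an iteration count. A minimal sketch of that arithmetic follows; the loop count of 10000 and the counter readings are assumptions chosen for illustration and are not taken from port80.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average cycles per access, computed from two counter readings.
 * All arithmetic is unsigned 64-bit, so if `end` is (wrongly) smaller
 * than `start`, the subtraction wraps around modulo 2^64.
 * `loops` is a hypothetical iteration count.
 */
uint64_t avg_cycles(uint64_t start, uint64_t end, uint64_t loops)
{
    assert(loops != 0);
    return (end - start) / loops;
}
```

With sane readings the function returns a small number; with the two readings swapped in size it returns a value in the 1.8 x 10^15 range, matching the shape of the bad output.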
Re: [RFT] Port 0x80 I/O speed
Rene Herman wrote:
> On 12-12-07 00:55, Nigel Cunningham wrote:
>
>> (AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo)
>>
>> [EMAIL PROTECTED]:~/Downloads$ gcc port80.c -o port80
>> [EMAIL PROTECTED]:~/Downloads$ sudo ./port80
>> cycles: out 1235, in 1207
>
> Looking good.
>
>> [EMAIL PROTECTED]:~/Downloads$ gcc -O2 port80.c -o port80
>> [EMAIL PROTECTED]:~/Downloads$ sudo ./port80
>> cycles: out 1844674407370794, in 1844674407369408
>
> Obviously not. I suppose this changes with -m32 on the GCC command
> line? (sorry for missing that, I have no 64-bit machines).

Yes, it does:

[EMAIL PROTECTED]:~/Downloads$ gcc -m32 -o port80 port80.c
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1231, in 1208
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1233, in 1210

Incidentally:

[EMAIL PROTECTED]:~/Downloads$ processor_speed

(A little script I made because my lappy does a solid lock every now and then that seems to be cpu-freq related - locking it to one frequency makes the lock far less common).

Speed is now 180.
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2472, in 2505
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2489, in 2515
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2481, in 2503
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2476, in 2507

So the same effect Maxim reported is seen here.

Regards,

Nigel
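The cycle count roughly doubles between the two sets of runs, which is what one would expect if a port 0x80 access takes a roughly fixed amount of wall-clock time while the CPU clock changes. A small sketch of the conversion, assuming the first runs were at 800 MHz (as stated) and the later ones at the CPU's nominal 1800 MHz (an assumption: the script output only says "180"):

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to nanoseconds for a CPU clocked at `mhz` MHz. */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t mhz)
{
    assert(mhz != 0);
    return cycles * 1000 / mhz;   /* integer division, truncates */
}
```

Under those assumptions, 1235 cycles at 800 MHz and 2472 cycles at 1800 MHz both land somewhere around 1.4 to 1.5 microseconds per access.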
Re: Oops in evdev_disconnect for kernel 2.6.23.12
Hi Berthold. Berthold Cogel wrote: Jan 1 17:34:39 wonderland kernel: usb 2-2: USB disconnect, address 3 Jan 1 17:34:39 wonderland kernel: usb 2-2.5: USB disconnect, address 4 Jan 1 17:34:39 wonderland kernel: drivers/input/tablet/wacom_sys.c: wacom_sys_irq - usb_submit_urb failed with result -19 Jan 1 17:34:39 wonderland kernel: usb 2-2.6: USB disconnect, address 5 Jan 1 17:34:39 wonderland kernel: BUG: unable to handle kernel paging request at virtual address 00100100 Jan 1 17:34:39 wonderland kernel: printing eip: Jan 1 17:34:39 wonderland kernel: f8819668 Jan 1 17:34:39 wonderland kernel: *pde = Jan 1 17:34:39 wonderland kernel: Oops: [#1] Jan 1 17:34:39 wonderland kernel: PREEMPT Jan 1 17:34:39 wonderland kernel: Modules linked in: isofs nls_iso8859_1 nls_cp437 vfat fat radeon drm rfcomm l2cap bluetooth ppdev lp fan ac battery joydev dm_crypt wacom dm_snapshot dm_mirror sr_mod sd_mod sbp2 usbhid hid ff_memless usb_storage snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 firmware_class snd_ac97_codec ac97_bus snd_util_mem snd_hwdep snd_pcm_oss snd_pcm snd_page_alloc snd_mixer_oss snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device parport_pc parport rtc i2c_viapro ohci1394 via_agp ide_cd agpgart snd ehci_hcd emu10k1_gp gameport 8139too soundcore thermal uhci_hcd ieee1394 processor button evdev Jan 1 17:34:39 wonderland kernel: CPU:0 Jan 1 17:34:39 wonderland kernel: EIP:0060:[f8819668]Not tainted VLI Jan 1 17:34:39 wonderland kernel: EFLAGS: 00010206 (2.6.23.12 #1) Jan 1 17:34:39 wonderland kernel: EIP is at evdev_disconnect+0x65/0x9e [evdev] Jan 1 17:34:39 wonderland kernel: eax: ebx: 000ffcf0 ecx: c1926760 edx: 0033 Jan 1 17:34:39 wonderland kernel: esi: f7415600 edi: f741564c ebp: f7415654 esp: c1967e68 Jan 1 17:34:39 wonderland kernel: ds: 007b es: 007b fs: gs: ss: 0068 Jan 1 17:34:39 wonderland kernel: Process khubd (pid: 136, ti=c1966000 task=c1926570 task.ti=c1966000) Jan 1 
17:34:39 wonderland kernel: Stack: f7415800 f7402000 f7402758 f740276c f7b94458 c03454b2 c03c6eb6 Jan 1 17:34:39 wonderland kernel:f7bda054 c029178a f788f520 f7bda000 f9b3c608 f9b3a3ab f7bda000 f7bda000 Jan 1 17:34:39 wonderland kernel:f7bda01c c0337954 f7bda01c f9b3c638 c02fdb59 f7bda01c f7bda01c Jan 1 17:34:39 wonderland kernel: Call Trace: Jan 1 17:34:39 wonderland kernel: [c03454b2] input_unregister_device+0x6f/0xff Jan 1 17:34:39 wonderland kernel: [c03c6eb6] klist_release+0x27/0x30 Jan 1 17:34:39 wonderland kernel: [c029178a] kref_put+0x5f/0x6c Jan 1 17:34:39 wonderland kernel: [f9b3a3ab] wacom_disconnect+0x2b/0x66 [wacom] Jan 1 17:34:39 wonderland kernel: [c0337954] usb_unbind_interface+0x2d/0x6e Jan 1 17:34:39 wonderland kernel: [c02fdb59] __device_release_driver+0x6e/0x8b Jan 1 17:34:39 wonderland kernel: [c02fdeaf] device_release_driver+0x1d/0x32 Jan 1 17:34:39 wonderland kernel: [c02fd599] bus_remove_device+0x6a/0x7a Jan 1 17:34:39 wonderland kernel: [c02fbde3] device_del+0x1c3/0x234 Jan 1 17:34:39 wonderland kernel: [c033567f] usb_disable_device+0x5c/0xbb Jan 1 17:34:39 wonderland kernel: [c0331ff9] usb_disconnect+0x7e/0xe6 Jan 1 17:34:39 wonderland kernel: [c0331fea] usb_disconnect+0x6f/0xe6 Jan 1 17:34:39 wonderland kernel: [c03324db] hub_thread+0x31c/0xa10 Jan 1 17:34:39 wonderland kernel: [c0114e17] update_curr+0x102/0x12c Jan 1 17:34:39 wonderland kernel: [c0114a13] update_stats_wait_end+0x96/0xb9 Jan 1 17:34:39 wonderland kernel: [c01281c7] autoremove_wake_function+0x0/0x33 Jan 1 17:34:39 wonderland kernel: [c03321bf] hub_thread+0x0/0xa10 Jan 1 17:34:39 wonderland kernel: [c012810e] kthread+0x36/0x5c Jan 1 17:34:39 wonderland kernel: [c01280d8] kthread+0x0/0x5c Jan 1 17:34:39 wonderland kernel: [c01048f7] kernel_thread_helper+0x7/0x10 Jan 1 17:34:39 wonderland kernel: === Jan 1 17:34:39 wonderland kernel: Code: 5e 4c 81 eb 10 04 00 00 eb 21 8d 83 08 04 00 00 b9 06 00 02 00 ba 1d 00 00 00 e8 6a 93 95 c7 8b 9b 10 04 00 00 81 eb 10 04 00 00 8b 83 10 
04 00 00 0f 18 00 90 8d 83 10 04 00 00 39 f8 75 cb 8d Jan 1 17:34:39 wonderland kernel: EIP: [f8819668] evdev_disconnect+0x65/0x9e [evdev] SS:ESP 0068:c1967e68 I'm using Debian stable/testing/unstable with homemade kernel 2.6.23.12 (patched with tuxonice-3.0-rc3-for-2.6.23.9). I tried to get my Wacom Bamboo grafic tablet to work with linux and the xorg driver from linuxwacom-0.7.9-4 (http://linuxwacom.sourceforge.net/). After 'configure/make/make install' from source and configuring Xorg, I got the tablet working for a simple user. But each time I tried to login with X as root (I know Bad idea :-)) xserver got restarted. I tried to trace the situation with stracing gdm. I did this via an ssh
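A note on the faulting address in the oops above: 00100100 is the value of LIST_POISON1 on 32-bit kernels of this era, which list_del() writes into the next pointer of a removed entry. The ebx value (000ffcf0) plus the 0x410 offset visible in the Code: bytes gives exactly that address, which would be consistent with a list walk touching an entry that had already been deleted. The following user-space sketch shows the mechanism only; struct client and its layout are hypothetical stand-ins, not the real evdev structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Same poison values as the kernel's include/linux/poison.h of that era. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Unlink an entry and poison its pointers, as list_del() does. */
void list_del_poison(struct list_head *entry)
{
    assert(entry->next != LIST_POISON1);   /* deleting twice is a bug */
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Hypothetical object whose list node sits 0x410 bytes in. */
struct client {
    char payload[0x410];
    struct list_head node;
};

/* What container_of() computes when handed a node address. */
uintptr_t client_from_node(uintptr_t node_addr)
{
    return node_addr - offsetof(struct client, node);
}
```

Following the next pointer of a deleted entry yields the poison address, and converting that back to its containing object gives 0x000ffcf0, the same value seen in ebx.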
What's in store for 2008 for TuxOnIce?
Hi all. With the start of a new year, I suppose it's a good time to think about what I'd like to do with TuxOnIce this year and see what feedback I get. First up, I'm thinking about closing the mailing lists and asking people to use LKML instead for reporting issues and so on. I'm thinking about this because it will help with allowing people who work on mainline to see how stable (or otherwise!) TuxOnIce is now. It should also help when (as often happens) bug reports aren't actually issues with the patch, but with vanilla (ie drivers). Perhaps it will also help with whatever effort I find time to make towards convincing Andrew that it really does have significant advantages over [u]swsusp and kexec based hibernation. Secondly, I'm planning on moving the website soonish. It's taken longer than I planned because it will be sharing with another server I'm maintaining, and it has taken longer than expected to find good hosting for the other server (which was done first). Now that I'm happy with the other server's state, I'm hoping to start shifting suspend2.net/tuxonice.net soon. For those who might be looking for hosting themselves, I'm using slicehost. I initially tried GoDaddy, but had terrible service, problems with draconian limits on the volume of outgoing email (1000/day by default - useless if you're doing mailing lists) and unexpected, unexplained delays in mail delivery through the SMTP delay they force you to use. Slicehost, on the other hand, are terrific to deal with in everyway. If you sign up with them because of this email, please consider putting my email (nigel at suspend2.net) as the referrer - I then get a discount on the cost of the hosting. Third, regarding the patch itself, I'm taking my time in working towards the 3.0 release. 
We don't have any major bugs with 3.0-rc3 reported, but I have some things I want to complete before the final release: * see it well tested; * get a finished initial version of the cluster support; * finish completing support for the new resume-from-other kernels functionality that Rafael has added in 2.6.24. (We can resume from the same kernel at the moment, but I need to convince myself that nosave data is properly handled). Regards, Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)
Hi. Rafael J. Wysocki wrote: On Tuesday, 1 of January 2008, Nigel Cunningham wrote: Hi all. Hi Nigel, Gidday :) With the start of a new year, I suppose it's a good time to think about what I'd like to do with TuxOnIce this year and see what feedback I get. First up, I'm thinking about closing the mailing lists and asking people to use LKML instead for reporting issues and so on. I'm thinking about this because it will help with allowing people who work on mainline to see how stable (or otherwise!) TuxOnIce is now. It should also help when (as often happens) bug reports aren't actually issues with the patch, but with vanilla (ie drivers). I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go to one of the kernel-related lists, but I think linux-pm may be better for that due to the much lower traffic. I guess that makes sense. I guess people can always be referred to LKML for the issues where the appropriate person isn't on linux-pm. Perhaps it will also help with whatever effort I find time to make towards convincing Andrew that it really does have significant advantages over [u]swsusp and kexec based hibernation. Secondly, I'm planning on moving the website soonish. It's taken longer than I planned because it will be sharing with another server I'm maintaining, and it has taken longer than expected to find good hosting for the other server (which was done first). Now that I'm happy with the other server's state, I'm hoping to start shifting suspend2.net/tuxonice.net soon. For those who might be looking for hosting themselves, I'm using slicehost. I initially tried GoDaddy, but had terrible service, problems with draconian limits on the volume of outgoing email (1000/day by default - useless if you're doing mailing lists) and unexpected, unexplained delays in mail delivery through the SMTP delay they force you to use. Slicehost, on the other hand, are terrific to deal with in everyway. 
If you sign up with them because of this email, please consider putting my email (nigel at suspend2.net) as the referrer - I then get a discount on the cost of the hosting. Third, regarding the patch itself, I'm taking my time in working towards the 3.0 release. We don't have any major bugs with 3.0-rc3 reported, but I have some things I want to complete before the final release:
* see it well tested;
* get a finished initial version of the cluster support;
* finish completing support for the new resume-from-other kernels functionality that Rafael has added in 2.6.24. (We can resume from the same kernel at the moment, but I need to convince myself that nosave data is properly handled).

Have you finished the support for freezing filesystems before freezing tasks that we talked about some time ago?

Hmm. I've had too many things going through my little brain since then. What I currently have is support for freezing fuse filesystems separately. It looks like:

    int freeze_processes(void)
    {
            int error;

            printk("Stopping fuse filesystems.\n");
            freeze_filesystems(FS_FREEZER_FUSE);
            freezer_state = FREEZER_FILESYSTEMS_FROZEN;
            printk("Freezing user space processes ... ");
            error = try_to_freeze_tasks(FREEZER_USER_SPACE);
            if (error)
                    goto Exit;
            printk("done.\n");

            sys_sync();
            printk("Stopping normal filesystems.\n");
            freeze_filesystems(FS_FREEZER_NORMAL);
            freezer_state = FREEZER_USERSPACE_FROZEN;
            printk("Freezing remaining freezable tasks ... ");
            error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
            if (error)
                    goto Exit;
            printk("done.");
            freezer_state = FREEZER_FULLY_ON;
     Exit:
            BUG_ON(in_atomic());
            printk("\n");
            return error;
    }

(I'm not yet worrying about ext3 on fuse or such like, but it shouldn't be hard to extend the model to do that).

Nigel
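The ordering in freeze_processes() above (fuse filesystems, then user space, then ordinary filesystems, then kernel threads, bailing out at the first task-freezing failure) can be modelled in user space. This is only a sketch of the sequencing: the struct, the function-pointer stages, and the stub stages are hypothetical and stand in for the real kernel calls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum freezer_state {
    FREEZER_OFF,
    FREEZER_FILESYSTEMS_FROZEN,
    FREEZER_USERSPACE_FROZEN,
    FREEZER_FULLY_ON
};

/* Hypothetical stage table: filesystem stages cannot fail, task stages can. */
struct freeze_plan {
    void (*freeze_fuse)(void);
    int  (*freeze_userspace)(void);
    void (*freeze_normal_fs)(void);
    int  (*freeze_kthreads)(void);
};

/* Run the stages in order; stop at the first failing task stage. */
int run_freeze_plan(const struct freeze_plan *plan, enum freezer_state *state)
{
    int error;

    assert(plan != NULL && state != NULL);

    plan->freeze_fuse();
    *state = FREEZER_FILESYSTEMS_FROZEN;

    error = plan->freeze_userspace();
    if (error)
        return error;

    plan->freeze_normal_fs();
    *state = FREEZER_USERSPACE_FROZEN;

    error = plan->freeze_kthreads();
    if (error)
        return error;

    *state = FREEZER_FULLY_ON;
    return 0;
}

/* Stand-in stages used to exercise the sequencing. */
void fs_noop(void) { }
int tasks_ok(void) { return 0; }
int tasks_busy(void) { return -EBUSY; }
```

If the user-space stage fails, the recorded state stays at FREEZER_FILESYSTEMS_FROZEN, mirroring how the posted function leaves freezer_state when it jumps to Exit.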
Re: [Suspend2-devel] Reboot problem
Hi Christian.

Christian Hesse wrote:
> On Tuesday 01 January 2008, Nigel Cunningham wrote:
>> Third, regarding the patch itself, I'm taking my time in working
>> towards the 3.0 release. We don't have any major bugs with 3.0-rc3
>> reported [...].
>
> Well, I think I still have a bug, though it is possibly a mainline
> problem and it's not a showstopper.
>
> After a suspend/resume cycle the reboot does not work. The system hangs
> with "Rebooting system" (or similar). After that you have to hard reset
> the system, which is not really a problem as filesystems have been
> unmounted before. Reboot without a suspend cycle before and halt with
> and without suspend cycle work without problems.

Just to clarify, do you mean rebooting after writing an image, or shutting down and rebooting? It could be that there's some change to the semantics in 2.6.24 that I haven't noticed yet.

> I'm using toi 3.0-rc3 with kernel 2.6.24-rc6 and beside the problem
> described above I'm really happy with toi.
>
> Happy new year to everybody!

And to you too!

Nigel
Re: [Suspend2-devel] Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)
Hi Ted. Theodore Tso wrote: On Wed, Jan 02, 2008 at 10:54:18AM +1100, Nigel Cunningham wrote: I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go to one of the kernel-related lists, but I think linux-pm may be better for that due to the much lower traffic. I guess that makes sense. I guess people can always be referred to LKML for the issues where the appropriate person isn't on linux-pm. Hi Nigel, I'd really recommend pushing the TuxOnIce discussions to LKML. That way people can see the size of the user community and Andrew and Linus can see how many people are using TuxOnIce. They can also see how well the TuxOnIce community helps address user problems, which is a big consideration when Linus decides whether or not to merge a particular technology. If the goal is eventual merger of TuxOnIce, LKML is really the best place to have the discussions. Examples such as Realtime, CFS, and others have shown that you really want to keep the discussion front and center. When one developer says, not my problem; my code is perfect, and the other developer is working with users who report problems, guess which technology generally ends up getting merged by Linus? Yes. The goal is eventual merger. That's what I was thinking too. Thanks for the input! Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Suspend2-devel] Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)
Hi. Rafael J. Wysocki wrote: On Wednesday, 2 of January 2008, Theodore Tso wrote: On Wed, Jan 02, 2008 at 10:54:18AM +1100, Nigel Cunningham wrote: I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go to one of the kernel-related lists, but I think linux-pm may be better for that due to the much lower traffic. I guess that makes sense. I guess people can always be referred to LKML for the issues where the appropriate person isn't on linux-pm. Hi Nigel, I'd really recommend pushing the TuxOnIce discussions to LKML. CCing linux-pm (or even linux-acpi) on problem reports would still be recommended, though. :-) Right. And that may make things easier as far as TuxOnIce users go too. I have one user who currently subscribes to suspend2-users who already tried subscribing to LKML and said he didn't like the experience. Using linux-pm instead would save some pain there. Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: freeze vs freezer
Hi.

Pavel Machek wrote:

Hi!

So how do you handle threads that are blocked on I/O or a lock during the system freeze process, then?

We wait until they can continue.

So if I have a process blocked on an unavailable NFS mount, I can't suspend?

That's correct, you can't. [And I know what you're going to say. ;-)]

Why exactly does suspend/hibernation depend on TASK_INTERRUPTIBLE instead of a zero preempt_count()? Really what we should do is just iterate over all of the actual physical devices and tell each one "Block new IO requests preemptably, finish pending DMA, put the hardware in low-power mode, and prepare for suspend/hibernate". As long as each driver knows how to do those simple things we can have an entirely consistent kernel image for both suspend and for hibernation.

"each driver" means this is a lot of work. But yes, that is probably way to go, and patch would be welcome.

Yes, that does work. It's what I've done in my (preliminary) support for fuse.

Regards,

Nigel
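The per-device steps proposed in the quoted text (block new I/O, finish pending DMA, enter low power) can be illustrated with a toy model. Everything here is hypothetical: struct sim_device and the sim_* functions are invented for the sketch and do not correspond to any real driver interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sim_device {
    bool accepting_io;   /* are new requests allowed? */
    int  pending_dma;    /* transfers still in flight */
    bool low_power;
};

/* Submit a request: refused once the device is blocking new I/O. */
bool sim_submit(struct sim_device *dev)
{
    if (!dev->accepting_io)
        return false;
    dev->pending_dma++;
    return true;
}

/* The three steps from the proposal, applied to one device. */
void sim_quiesce(struct sim_device *dev)
{
    dev->accepting_io = false;      /* 1. block new I/O requests */
    while (dev->pending_dma > 0)    /* 2. finish pending DMA */
        dev->pending_dma--;         /*    (stand-in for waiting on hardware) */
    dev->low_power = true;          /* 3. enter the low-power state */
}

/* Quiesce every device; afterwards nothing is in flight anywhere. */
void sim_quiesce_all(struct sim_device *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sim_quiesce(&devs[i]);
}
```

The point of the model is only that once every device refuses new work and has drained what it had, the memory image can no longer change underneath the snapshot, regardless of what state individual tasks are sleeping in.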
Re: [Suspend2-users] [Suspend2-devel] Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)
Hi Martin. Martin Steigerwald wrote: Am Mittwoch 02 Januar 2008 schrieb Nigel Cunningham: Hi. Hi, Rafael J. Wysocki wrote: On Wednesday, 2 of January 2008, Theodore Tso wrote: On Wed, Jan 02, 2008 at 10:54:18AM +1100, Nigel Cunningham wrote: I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go to one of the kernel-related lists, but I think linux-pm may be better for that due to the much lower traffic. I guess that makes sense. I guess people can always be referred to LKML for the issues where the appropriate person isn't on linux-pm. Hi Nigel, I'd really recommend pushing the TuxOnIce discussions to LKML. CCing linux-pm (or even linux-acpi) on problem reports would still be recommended, though. :-) Right. And that may make things easier as far as TuxOnIce users go too. I have one user who currently subscribes to suspend2-users who already tried subscribing to LKML and said he didn't like the experience. Using linux-pm instead would save some pain there. I am a bit reluctant about LKML from some of the discussions I have seen there and participated in during CFS / CK discussion. I really didn't like the tone. Its one thing to say ones own oppinion, another one to bash at each other as if there was no tomorrow. This has been refreshingly different on tuxonice mailing lists. I am also a bit reluctant about the traffic. I already have some quite high traffic mailinglists with 3-4 mails a year, but LKML would top these easily I guess and I am not that sure I want to put that load on my mail infrastructure to follow TuxOnIce developments. I think this is a generic problem for testers of specific kernel subsystems... But then LKML is were TuxOnIce is visible to the kernel developer community. I would appreciate linux-pm I think maybe with a guideline to CC to LKML in usual cases... Thanks for your feedback. I think that's the way to go. BTW: toi-3.0-rc3 is rocking along nicely on my two ThinkPads (T42 and T23)... 
I am using 2.6.23.12 with cfs-v24.1... Great to hear! Nigel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: freeze vs freezer
Hi. Rafael J. Wysocki wrote: On Wednesday, 2 of January 2008, Nigel Cunningham wrote: Pavel Machek wrote: So how do you handle threads that are blocked on I/O or a lock during the system freeze process, then? We wait until they can continue. So if I have a process blocked on an unavilable NFS mount, I can't suspend? That's correct, you can't. [And I know what you're going to say. ;-)] Why exactly does suspend/hibernation depend on TASK_INTERRUPTIBLE instead of a zero preempt_count()? Really what we should do is just iterate over all of the actual physical devices and tell each one Block new IO requests preemptably, finish pending DMA, put the hardware in low-power mode, and prepare for suspend/hibernate. As long as each driver knows how to do those simple things we can have an entirely consistent kernel image for both suspend and for hibernation. each driver means this is a lot of work. But yes, that is probably way to go, and patch would be welcome. Yes, that does work. It's what I've done in my (preliminary) support for fuse. Hmm, can you please elaborate a bit? Sorry. I wasn't very unambiguous, was I? And I'm not sure now whether you're meaning How does fuse support relate to freezing block devices? or What's this about fuse support?. Let me therefore seek to answer both questions: Higher level, I know (filesystems rather than block devices), but I was meaning the general concept of blocking new requests and completing existing ones worked fine for the supposedly impossible fuse support. Re fuse support, let me start by saying I know this doesn't handle all situations, but I think it's a good enough proof-of-concept implementation. I added some simple hooks to the code for submitting new work to fuse threads. 
    #define FUSE_MIGHT_FREEZE(superblock, desc) \
    do { \
            int printed = 0; \
            while (superblock->s_frozen != SB_UNFROZEN) { \
                    if (!printed) { \
                            printk("%d frozen in " desc ".\n", current->pid); \
                            printed = 1; \
                    } \
                    try_to_freeze(); \
                    yield(); \
            } \
    } while (0)

On top of this, I made a (too simple at the moment) freeze_filesystems function which iterates through super_blocks in reverse order, freezing fuse filesystems or ordinary ones. I say 'too simple' because it doesn't currently allow for the possibility of someone mounting (say) ext3 on fuse, but that would just be an extension of what's already done.

The end result is:

    int freeze_processes(void)
    {
            int error;

            printk(KERN_INFO "Stopping fuse filesystems.\n");
            freeze_filesystems(FS_FREEZER_FUSE);
            freezer_state = FREEZER_FILESYSTEMS_FROZEN;
            printk(KERN_INFO "Freezing user space processes ... ");
            error = try_to_freeze_tasks(FREEZER_USER_SPACE);
            if (error)
                    goto Exit;
            printk(KERN_INFO "done.\n");

            sys_sync();
            printk(KERN_INFO "Stopping normal filesystems.\n");
            freeze_filesystems(FS_FREEZER_NORMAL);
            freezer_state = FREEZER_USERSPACE_FROZEN;
            printk(KERN_INFO "Freezing remaining freezable tasks ... ");
            error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
            if (error)
                    goto Exit;
            printk(KERN_INFO "done.");
            freezer_state = FREEZER_FULLY_ON;
     Exit:
            BUG_ON(in_atomic());
            printk("\n");
            return error;
    }

Sorry if that's more info than you wanted.

Nigel
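The FUSE_MIGHT_FREEZE hook above is a gate: a task about to submit new work loops, yielding, for as long as the superblock is marked frozen. A single-threaded sketch of that control flow follows. The sim_* names and the thaw_after counter are hypothetical; the counter merely stands in for another task thawing the filesystem while this one yields.

```c
#include <assert.h>
#include <stddef.h>

struct sim_sb {
    int frozen;       /* non-zero while the filesystem is frozen */
    int thaw_after;   /* simulated: yields remaining until thaw */
};

/* Stand-in for try_to_freeze() + yield(): lets simulated time pass. */
static void sim_yield(struct sim_sb *sb)
{
    if (sb->thaw_after > 0 && --sb->thaw_after == 0)
        sb->frozen = 0;
}

/*
 * Gate called before submitting new work. Returns how many times the
 * caller had to yield before the filesystem was unfrozen.
 */
int sim_might_freeze(struct sim_sb *sb)
{
    int waits = 0;

    assert(sb != NULL);
    /* In this model a frozen sb must have a thaw scheduled, or we spin forever. */
    assert(!sb->frozen || sb->thaw_after > 0);

    while (sb->frozen) {
        sim_yield(sb);
        waits++;
    }
    return waits;
}
```

An unfrozen filesystem lets the caller straight through; a frozen one holds it at the gate until the thaw happens, so no new request reaches the fuse server in between.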
Re: freeze vs freezer
Hi.

Oliver Neukum wrote:
> On Thursday 03 January 2008, Nigel Cunningham wrote:
>> On top of this, I made a (too simple at the moment) freeze_filesystems
>> function which iterates through super_blocks in reverse order, freezing
>> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
>> currently allow for the possibility of someone mounting (say) ext3 on
>> fuse, but that would just be an extension of what's already done.
>
> How do you deal with fuse server tasks using other fuse filesystems?

Since they're frozen in reverse order, the dependent one would be frozen first.

> How does freeze_filesystems() look?

Removing my ugly debugging statements, it's currently:

    /**
     * freeze_filesystems - lock all filesystems and force them into a consistent
     * state
     */
    void freeze_filesystems(int which)
    {
            struct super_block *sb;

            lockdep_off();

            /*
             * Freeze in reverse order so filesystems dependent upon others are
             * frozen in the right order (eg. loopback on ext3).
             */
            list_for_each_entry_reverse(sb, &super_blocks, s_list) {
                    if (sb->s_type->fs_flags & FS_IS_FUSE &&
                        sb->s_frozen == SB_UNFROZEN &&
                        which & FS_FREEZER_FUSE) {
                            sb->s_frozen = SB_FREEZE_TRANS;
                            sb->s_flags |= MS_FROZEN;
                            continue;
                    }

                    if (!sb->s_root || !sb->s_bdev ||
                        (sb->s_frozen == SB_FREEZE_TRANS) ||
                        (sb->s_flags & MS_RDONLY) ||
                        (sb->s_flags & MS_FROZEN) ||
                        !(which & FS_FREEZER_NORMAL))
                            continue;

                    freeze_bdev(sb->s_bdev);
                    sb->s_flags |= MS_FROZEN;
            }

            lockdep_on();
    }

Nigel
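The two-pass, reverse-order walk done by freeze_filesystems() can be tried out on a plain array standing in for the super_blocks list. This is a simplified sketch: the SIM_* flag values, struct sim_sb, and the order-recording array are all hypothetical, and the real function's s_root/s_bdev checks are reduced to a single read-only test.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_FS_IS_FUSE     0x1   /* hypothetical flag values */
#define SIM_FREEZER_FUSE   0x1
#define SIM_FREEZER_NORMAL 0x2

struct sim_sb {
    const char *name;
    int fs_flags;
    int read_only;
    int frozen;
};

/*
 * Walk the mount list backwards (newest mount first) and freeze the
 * filesystems selected by `which`. Each frozen filesystem's name is
 * written to `order`; the return value is how many were frozen.
 */
size_t sim_freeze_filesystems(struct sim_sb *sbs, size_t n, int which,
                              const char **order)
{
    size_t count = 0;

    assert(sbs != NULL || n == 0);
    for (size_t i = n; i-- > 0; ) {
        struct sim_sb *sb = &sbs[i];

        if (sb->frozen)
            continue;
        if (sb->fs_flags & SIM_FS_IS_FUSE) {
            if (!(which & SIM_FREEZER_FUSE))
                continue;
        } else {
            if (sb->read_only || !(which & SIM_FREEZER_NORMAL))
                continue;
        }
        sb->frozen = 1;
        order[count++] = sb->name;
    }
    return count;
}
```

Calling it first with SIM_FREEZER_FUSE and then with SIM_FREEZER_NORMAL reproduces the two stages used by freeze_processes(): within each pass, later mounts are frozen before the earlier ones they may sit on top of.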
Re: freeze vs freezer
Hi.

Oliver Neukum wrote:
> On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
> > Hi.
> >
> > Oliver Neukum wrote:
> > > On Thursday 03 January 2008, Nigel Cunningham wrote:
> > > > On top of this, I made a (too simple at the moment) freeze_filesystems function which iterates through super_blocks in reverse order, freezing fuse filesystems or ordinary ones. I say 'too simple' because it doesn't currently allow for the possibility of someone mounting (say) ext3 on fuse, but that would just be an extension of what's already done.
> > >
> > > How do you deal with fuse server tasks using other fuse filesystems?
> >
> > Since they're frozen in reverse order, the dependent one would be frozen first.
>
> Say I do:
>
> a) mount fuse on /tmp/first
> b) mount fuse on /tmp/second
>
> Then the server task for (a) does "ls /tmp/second". So it will be frozen, right? How do you then freeze (a)? And keep in mind that the server task may have forked.

I guess I should first ask, is this a real life problem or a hypothetical twisted web? I don't see why you would want to make two filesystems interdependent - it sounds like the way to create livelocks and deadlocks in normal use, before we even begin to think about hibernating.

Regards,

Nigel
Re: freeze vs freezer
Hi.

Pavel Machek wrote:
> On Fri 2008-01-04 21:54:06, Oliver Neukum wrote:
> > On Thursday, 3 January 2008 23:06:07, Nigel Cunningham wrote:
> > > Oliver Neukum wrote:
> > > > On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
> > > > > Oliver Neukum wrote:
> > > > > > On Thursday 03 January 2008, Nigel Cunningham wrote:
> > > > > > > On top of this, I made a (too simple at the moment) freeze_filesystems function which iterates through super_blocks in reverse order, freezing fuse filesystems or ordinary ones. I say 'too simple' because it doesn't currently allow for the possibility of someone mounting (say) ext3 on fuse, but that would just be an extension of what's already done.
> > > > > >
> > > > > > How do you deal with fuse server tasks using other fuse filesystems?
> > > > >
> > > > > Since they're frozen in reverse order, the dependent one would be frozen first.
> > > >
> > > > Say I do:
> > > >
> > > > a) mount fuse on /tmp/first
> > > > b) mount fuse on /tmp/second
> > > >
> > > > Then the server task for (a) does "ls /tmp/second". So it will be frozen, right? How do you then freeze (a)? And keep in mind that the server task may have forked.
> > >
> > > I guess I should first ask, is this a real life problem or a hypothetical twisted web? I don't see why you would want to make two filesystems interdependent - it sounds like the way to create livelocks and deadlocks in normal use, before we even begin to think about hibernating.
> >
> > Good questions. I personally don't use fuse, but I do care about power management. The problem I see is that an unprivileged user could make that dependency, even inadvertently.
>
> The other problem is that an unprivileged user can do it with evil intent: a so-called denial-of-service attack.

Only in this case it would be a denial-of-denial-of-service attack, since it would stop you hibernating or suspending :).

This is still all hypothetical. If I could have a real life case where this could actually happen, it would help a lot.

Nigel
Re: Oops in evdev_disconnect for kernel 2.6.23.12
Hi.

Berthold Cogel wrote:
> Al Viro wrote:
> > On Tue, Jan 01, 2008 at 08:26:05PM +0100, Berthold Cogel wrote:
> > > Jan 1 17:34:39 wonderland kernel: BUG: unable to handle kernel paging request at virtual address 00100100
> >
> > LIST_POISON1
> >
> > > Jan 1 17:34:39 wonderland kernel: EIP is at evdev_disconnect+0x65/0x9e
> >
> > and by the look of code, it's a bit before the call of something that gets 0x20006 as one of its arguments. Which, by the look of evdev.s, gets passed only to kill_fasync(). So it's POLL_HUP, so this code could be, these days:
> >
> >	spin_lock(&evdev->client_lock);
> >	list_for_each_entry(client, &evdev->client_list, node)
> >		kill_fasync(&client->fasync, SIGIO, POLL_HUP);
> >	spin_unlock(&evdev->client_lock);
> >
> > in evdev_hangup() or, prior to commit 6addb1d6de1968b84852f54561cc9a09b5a9:
> >
> >	list_for_each_entry(client, &evdev->client_list, node)
> >		kill_fasync(&client->fasync, SIGIO, POLL_HUP);
> >
> > in evdev_disconnect()
> >
> > > I'm using Debian stable/testing/unstable with homemade kernel 2.6.23.12 (patched with tuxonice-3.0-rc3-for-2.6.23.9).
> >
> > ... and seeing that this changeset postdates 2.6.23 *and* adds locking to the lists we are traversing in either variant, I'd bet that the kernel you have does *NOT* have the changeset in question, that you have list corruption from a race and that your oops is list_for_each_entry() trying to walk forward from an entry that just had list_del() poisoning its ->next.
> >
> > There are only 4 changesets between 2.6.23 and this one affecting drivers/input and only 8006479c9b75fb6594a7b746af3d7f1fbb68f18f and 6addb1d6de1968b84852f54561cc9a09b5a9 appear to be relevant. Apply to your kernel and see if it helps...
>
> Looks as if I have to start using git ... I always feared that this day will come. ;-)
>
> If I'm able to reproduce the oops with my patched kernel, I will gladly follow your advice.
>
> Regards,
> Berthold

I can't do it immediately but I'll send you the patches to try later in the day if you like.

Nigel
Re: [RFC][PATCH -mm] Freezer: Do not allow freezing processes to clear TIF_SIGPENDING
Hi.

On Friday 19 October 2007 08:22:35 Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki [EMAIL PROTECTED]
>
> Do not allow processes to clear their TIF_SIGPENDING if TIF_FREEZE is set, to prevent them from racing with the freezer (like mysqld does, for example).
>
> Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]

Acked-by: Nigel Cunningham [EMAIL PROTECTED]

> ---
>  kernel/signal.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6.23-mm1/kernel/signal.c
> ===
> --- linux-2.6.23-mm1.orig/kernel/signal.c
> +++ linux-2.6.23-mm1/kernel/signal.c
> @@ -124,7 +124,7 @@ void recalc_sigpending_and_wake(struct t
>
>  void recalc_sigpending(void)
>  {
> -	if (!recalc_sigpending_tsk(current))
> +	if (!recalc_sigpending_tsk(current) && !freezing(current))
>  		clear_thread_flag(TIF_SIGPENDING);
>
>  }

--
Nigel, Michelle, Alisdair and Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia
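The effect of the one-line change can be checked with a small user-space model. This is a sketch under stated assumptions: struct model_task and both functions are hypothetical stand-ins for the kernel's task flags, and the model assumes that the freezer of this era kicks user-space tasks by leaving TIF_SIGPENDING set without queuing a real signal, which is what the patch description implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the three pieces of task state involved. */
struct model_task {
    bool signal_queued;   /* a real, unblocked signal is waiting */
    bool freezing;        /* models TIF_FREEZE being set */
    bool tif_sigpending;  /* models TIF_SIGPENDING */
};

/* Sets the flag and returns true when a signal is genuinely pending. */
static bool model_recalc_sigpending_tsk(struct model_task *t)
{
    if (t->signal_queued) {
        t->tif_sigpending = true;
        return true;
    }
    return false;
}

/* Patched behaviour: never clear the flag while the freezer wants the task. */
void model_recalc_sigpending(struct model_task *t)
{
    assert(t != NULL);
    if (!model_recalc_sigpending_tsk(t) && !t->freezing)
        t->tif_sigpending = false;
}
```

Without the `!t->freezing` term, a task that recalculates its pending signals after the freezer has flagged it would clear the flag and could go back to sleep instead of reaching the refrigerator.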
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi Andrew.

On Thursday 20 September 2007 20:09:41 Pavel Machek wrote:
> Seems like good enough for -mm to me.
>
> Pavel

Andrew, if I recall correctly, you said a while ago that you didn't want another hibernation implementation in the vanilla kernel. If you're going to consider merging this kexec code, will you also please consider merging TuxOnIce?

Regards,

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info.
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Friday 21 September 2007 11:06:23 Andrew Morton wrote:
> On Fri, 21 Sep 2007 10:24:34 +1000 Nigel Cunningham [EMAIL PROTECTED] wrote:
> > Hi Andrew.
> >
> > On Thursday 20 September 2007 20:09:41 Pavel Machek wrote:
> > > Seems like good enough for -mm to me.
> > >
> > > Pavel
> >
> > Andrew, if I recall correctly, you said a while ago that you didn't want another hibernation implementation in the vanilla kernel. If you're going to consider merging this kexec code, will you also please consider merging TuxOnIce?
>
> The theory is that kexec-based hibernation will mainly use preexisting kexec code and will permit us to delete the existing hibernation implementation.
>
> That's different from replacing it.

TuxOnIce doesn't remove the existing implementation either. It can transparently replace it, but you can enable/disable that at compile time.

Regards,

Nigel
--
Nigel Cunningham
Christian Reformed Church of Cobden
103 Curdie Street, Cobden 3266, Victoria, Australia
Ph. +61 3 5595 1185 / +61 417 100 574
Communal Worship: 11 am Sunday.
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Friday 21 September 2007 11:41:06 Andrew Morton wrote:
> > On Friday 21 September 2007 11:06:23 Andrew Morton wrote:
> > > On Fri, 21 Sep 2007 10:24:34 +1000 Nigel Cunningham [EMAIL PROTECTED] wrote:
> > > > Hi Andrew.
> > > >
> > > > On Thursday 20 September 2007 20:09:41 Pavel Machek wrote:
> > > > > Seems like good enough for -mm to me.
> > > > >
> > > > > Pavel
> > > >
> > > > Andrew, if I recall correctly, you said a while ago that you didn't want another hibernation implementation in the vanilla kernel. If you're going to consider merging this kexec code, will you also please consider merging TuxOnIce?
> > >
> > > The theory is that kexec-based hibernation will mainly use preexisting kexec code and will permit us to delete the existing hibernation implementation.
> > >
> > > That's different from replacing it.
> >
> > TuxOnIce doesn't remove the existing implementation either. It can transparently replace it, but you can enable/disable that at compile time.
>
> Right. So we end up with two implementations in-tree. Whereas kexec-based-hibernation leads us to having zero implementations in-tree.
>
> See, it's different.

That's not true. Kexec will itself be an implementation, otherwise you'd end up with people screaming about no hibernation support. And it won't result in the complete removal of the existing hibernation code from the kernel. At the very least, it's going to want the kernel being hibernated to have an interface by which it can find out which pages need to be saved. I wouldn't be surprised if it also ends up with an interface in which the kernel being hibernated tells it what bdev/sectors in which to save the image as well (otherwise you're going to need a dedicated, otherwise untouched partition exclusively for the kexec'd kernel to use), or what network settings to use if it wants to try to save the image to a network storage device. On top of that, there are all the issues related to device reinitialisation and so on, and it looks like there's greatly increased pain for users wanting to configure this new implementation.

Kexec is by no means proven to be the panacea for all the issues.

Regards,

Nigel
--
Nigel Cunningham
Pastor
Christian Reformed Church of Cobden
Victoria, Australia
+61 3 5595 1185
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Friday 21 September 2007 12:18:57 Huang, Ying wrote:
> > That's not true. Kexec will itself be an implementation, otherwise you'd end up with people screaming about no hibernation support. And it won't result in the complete removal of the existing hibernation code from the kernel. At the very least, it's going to want the kernel being hibernated to have an interface by which it can find out which pages need to be saved. I wouldn't
>
> This has been done by kexec/kdump guys. There is a makedumpfile utility and vmcoreinfo kernel mechanism to implement this. We can just reuse the work of kexec/kdump.

You've already said that you are currently saving all pages. How are you going to avoid saving free pages if you don't get the information from the kernel being saved? This will require more than just code reuse.

> > be surprised if it also ends up with an interface in which the kernel being hibernated tells it what bdev/sectors in which to save the image as well (otherwise you're going to need a dedicated, otherwise untouched partition exclusively for the kexec'd kernel to use), or what network settings to use if it wants to try to save the image to a network storage device. On top of
>
> These can be done in user space. The image writing will be done in user space for kexec base hibernation.

That only complicates things more. Now you need to get the information on where to save the image from the kernel being saved, then transfer it to userspace after switching to the kexec kernel. That's more kernel code, not less.

> > that, there are all the issues related to device reinitialisation and so on,
>
> Yes. Device reinitialisation is needed. But all in all, kexec based hibernation can be much simpler on the kernel side.

Sorry, but I'm yet to be convinced. I'm not unwilling, I'm just not there yet.

> > and it looks like there's greatly increased pain for users wanting to configure this new implementation. Kexec is by no means proven to be the panacea for all the issues.
>
> Configuration is a problem, we will work on it. But, because it is based on kexec/kdump instead of starting from scratch, the duplicated part between hibernation and kexec/kdump can be eliminated.

Regards,

Nigel
--
Nigel, Michelle and Alisdair Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Friday 21 September 2007 12:45:57 Huang, Ying wrote:
> On Fri, 2007-09-21 at 12:25 +1000, Nigel Cunningham wrote:
> > Hi.
> >
> > On Friday 21 September 2007 12:18:57 Huang, Ying wrote:
> > > > That's not true. Kexec will itself be an implementation, otherwise you'd end up with people screaming about no hibernation support. And it won't result in the complete removal of the existing hibernation code from the kernel. At the very least, it's going to want the kernel being hibernated to have an interface by which it can find out which pages need to be saved. I wouldn't
> > >
> > > This has been done by kexec/kdump guys. There is a makedumpfile utility and vmcoreinfo kernel mechanism to implement this. We can just reuse the work of kexec/kdump.
> >
> > You've already said that you are currently saving all pages. How are you going to avoid saving free pages if you don't get the information from the kernel being saved? This will require more than just code reuse.
>
> I have not tried makedumpfile. makedumpfile avoids saving free pages by checking the mem_map of the original kernel. I think there is nothing to prevent it being used for kexec based hibernation image writing.
>
> This is an example of duplicated effort between kexec/kdump and the original hibernation implementation. Both kexec/kdump and hibernation need to save a memory image without saving the free pages. This can be done once instead of twice.

Ok.

> > > > be surprised if it also ends up with an interface in which the kernel being hibernated tells it what bdev/sectors in which to save the image as well (otherwise you're going to need a dedicated, otherwise untouched partition exclusively for the kexec'd kernel to use), or what network settings to use if it wants to try to save the image to a network storage device. On top of
> > >
> > > These can be done in user space. The image writing will be done in user space for kexec base hibernation.
> >
> > That only complicates things more. Now you need to get the information on where to save the image from the kernel being saved, then transfer it to userspace after switching to the kexec kernel. That's more kernel code, not less.
>
> This is fairly simple in fact. For example, you can specify the bdev/sectors on the kernel command line when doing the kexec load (kexec -l ... --append='...'); then the image writing system can get it through cat /proc/cmdline.

Sounds doable, as long as you can cope with long command lines (which shouldn't be a biggie). (If you've got a swapfile or parts of a swap partition already in use, it can be quite fragmented).

Andrew, you're seeing that it really doesn't mean the removal of all hibernation code from the kernel being suspended, aren't you? (And if the kexec'd kernel is the same binary, then there's more code again).

Regards,

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info.
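To make the command-line idea above concrete, here is a rough sketch of how an image writer might pull a device and extent list out of the text it reads from /proc/cmdline. Everything specific is an assumption made for illustration: the hib_extents= parameter name, its MAJOR:MINOR:START-END[,START-END...] syntax, and the parse_hib_extents() helper do not exist in kexec or in any posted patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct extent {
    unsigned long long start, end;  /* inclusive sector range */
};

/*
 * Parse a hypothetical "hib_extents=MAJOR:MINOR:START-END[,START-END...]"
 * token from a kernel command line string. Returns the number of extents
 * stored in out (parsing stops once max have been stored), or -1 if the
 * parameter is absent or malformed.
 */
int parse_hib_extents(const char *cmdline, unsigned *major, unsigned *minor,
                      struct extent *out, int max)
{
    static const char key[] = "hib_extents=";
    const char *p;
    int consumed = 0, n = 0;

    assert(cmdline && major && minor && out && max > 0);

    p = strstr(cmdline, key);
    if (!p)
        return -1;
    p += sizeof(key) - 1;

    /* %n is only reached if both numbers and both colons matched. */
    if (sscanf(p, "%u:%u:%n", major, minor, &consumed) != 2 || consumed == 0)
        return -1;
    p += consumed;

    while (n < max) {
        char *endp;
        unsigned long long s, e;

        if (!isdigit((unsigned char)*p))
            return -1;
        s = strtoull(p, &endp, 10);
        if (*endp != '-' || !isdigit((unsigned char)endp[1]))
            return -1;
        e = strtoull(endp + 1, &endp, 10);
        if (e < s)
            return -1;

        out[n].start = s;
        out[n].end = e;
        n++;

        p = endp;
        if (*p != ',')
            break;      /* end of the list */
        p++;
    }
    return n;
}
```

A fragmented swap file produces one START-END pair per extent, so the list can grow long, which is the command-line length concern raised in the reply.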
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Friday 21 September 2007 21:56:29 Rafael J. Wysocki wrote:
> [Besides, the current hibernation userland interface is used by default by openSUSE and it's also used by quite some Debian users, so we can't drop it overnight and it can't be implemented in a compatible way on top of the kexec-based solution.]

Could it be fudged by giving userland a null image and having (say) the first ioctl be one that triggers all the real work (with other ioctls being noops or such like, as appropriate)?

Regards,

Nigel
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Friday 21 September 2007 22:18:19 Rafael J. Wysocki wrote:
> On Friday, 21 September 2007 13:58, Nigel Cunningham wrote:
> > Hi.
> >
> > On Friday 21 September 2007 21:56:29 Rafael J. Wysocki wrote:
> > > [Besides, the current hibernation userland interface is used by default by openSUSE and it's also used by quite some Debian users, so we can't drop it overnight and it can't be implemented in a compatible way on top of the kexec-based solution.]
> >
> > Could it be fudged by giving userland a null image and having (say) the first ioctl be one that triggers all the real work (with other ioctls being noops or such like, as appropriate)?
>
> Well, the suspend part is probably doable, but I'm afraid of the resume one.

'k. I've occasionally thought about trying it, but haven't ever gotten around to actually doing it yet. (I'd like to make TuxOnIce transparently replace both swsusp and uswsusp if I could).

Regards,

Nigel
Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Saturday 22 September 2007 09:19:18 Kyle Moffett wrote:
> I think that in order for this to work, there would need to be some ABI whereby the resume-ing kernel can pass its entire ACPI state and a bunch of other ACPI-related device details to the resume-ed kernel, which I believe it does not do at the moment. I believe that what causes problems is the ACPI state data that the kernel stores is *different* between identical sequential boots, especially when you add/remove/replace batteries, AC, etc.

That's certainly possible. We already pass a very small amount of data between the boot and resuming kernels at the moment, and it's done quite simply - by putting the variables we want to 'transfer' in a nosave page/section. I could conceive of a scheme wherein this was extended for driver data. Since the memory needed would depend on the drivers loaded, it would probably require that the space be allocated when hibernating, and the locations of structures be stored in the image header and then drivers notified of the locations to use when preparing to resume, but it could work...

> Since we currently throw away most of that in-kernel ACPI interpreter state data when we load the to-be-resumed image and replace it with the state from the previous boot it looks to the ACPI code and firmware like our system's hardware magically changed behind its back. The result is that the ACPI and firmware code is justifiably confused (although probably it should be more idempotent to begin with). There's 2 potential solutions:
>
> 1) Formalize and copy a *lot* of ACPI state from the resume-ing kernel to the resume-ed kernel.
> 2) Properly call the ACPI S4 methods in the proper order

... that said, I don't think the above should be necessary in most cases. I believe we're already calling the ACPI S4 methods in the proper order. If I understood correctly, Rafael put a lot of effort into learning what that was, and into ensuring it does get done.

> Neither one is particularly easy or particularly pleasant, especially given all the vendor bugs in this general area. Theoretically we should be able to do both, since one will be more reliable than the other on different systems depending on what kinds of firmware bugs they have.

Regards,

Nigel
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Thursday 27 September 2007 06:30:36 Joseph Fannin wrote:
> On Fri, Sep 21, 2007 at 11:45:12AM +0200, Pavel Machek wrote:
> > Hi!
> >
> > > > Sounds doable, as long as you can cope with long command lines (which shouldn't be a biggie). (If you've got a swapfile or parts of a swap partition already in use, it can be quite fragmented).
> > >
> > > Hmm. This is an interesting problem. Sharing a swap file or a swap partition with the actual swap of user space pages does seem to be a limitation of this approach.
> > >
> > > Although the fact that it is simple to write to a separate file may be a reasonable compensation.
> >
> > I'm not sure how you'd write it to a separate file. Notice that kjump kernel may not mount journalling filesystems, not even read-only. (Ext3 replays journal in that case). You could pass block numbers from the original kernel...
>
> The ext3 thing is a bug, the case for which I don't think has been adequately explained to the ext[34] folks. There should be at least a no_replay mount flag available, or something. It has ramifications for more than just hibernation.
>
> And yeah, I'm gonna bring up the swap files thing again. If you can hibernate to a swap file, you can hibernate to a dedicated hibernation file, and vice versa.
>
> If you can't hibernate to a swap file, then swap files are effectively unsupported for any system you might want to hibernate. <handwave>I wonder what embedded folks would think about that</handwave>.
>
> But, in my ignorance, I'm not sure even fixing the ext3 bug will guarantee you consistent metadata so that you can handle a swap/hibernate file. You can do a sync(), but how do you make that not race against running processes without the freezer, or blkdev snapshots?
>
> I guess uswsusp and the-patch-previously-known-as-suspend2 handle this somehow, though.
>
> (It's that same ignorance that has me waiting for someone with established credit with kernel people to make that argument for the ext3 bug, so I can hang my own reasons for thinking that it's bad off of theirs).

I haven't looked at swsusp support, but TuxOnIce handles all storage (swap partitions, swap files and ordinary files) by first allocating swap (if we're using swap), then bmapping the storage we're going to use. After that, we can freeze filesystems and processes with impunity. The allocated storage is then viewed as just a collection of bdevs, each with an ordered chain of extents defining which blocks we're going to read/write - a series of tapes if you like. In the image header, we store dev_ts and the block chains, together with the configuration information. As long as the same bdevs are configured at boot time prior to the echo > /sys/power/resume, we're in business. Filesystems don't need to be mounted because we don't use filesystem code anyway. (LVM etc does though in so far as it's needed to make the dev_t match the device again).

This matches with what you said above about hibernating to swap files and dedicated hibernation files - TuxOnIce uses exactly the same code to do the i/o to both; the variation is in the code to recognise the image header and allocate/free/bmap storage.

<not a filesystem expert>
Personally, I don't think ext[34] is broken. If there's data being left in the journal that will need replaying, then mounting without replaying the journal sounds wrong. Perhaps you should instead be arguing that nothing should be left in the journal after a filesystem freeze. But, of course, current code isn't doing a filesystem freeze (just a process freeze) and the kexec guys want to take even that away.
</not a filesystem expert>

In short, I agree. AFAICS, you need both the process freezer and filesystem freezing to make this thing fly properly.

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info.
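The "series of tapes" picture in the message above can be sketched as a lookup from an image page number to a (device, block) pair. This is a hedged user-space illustration only: struct model_extent, struct model_tape and locate_block() are hypothetical names and are not taken from the TuxOnIce sources.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous run of blocks on a device; both ends inclusive. */
struct model_extent {
    unsigned long start, end;
    struct model_extent *next;
};

/* One "tape": a device plus its ordered chain of extents. */
struct model_tape {
    unsigned dev;                 /* stands in for a dev_t */
    struct model_extent *first;
};

/*
 * Map the n-th image page (0-based) onto a device and block number by
 * walking the tapes in order and, within each tape, its extents in order.
 * Returns 0 on success, -1 if the chains hold fewer than n + 1 blocks.
 */
int locate_block(const struct model_tape *tapes, size_t ntapes,
                 unsigned long n, unsigned *dev, unsigned long *block)
{
    assert(tapes != NULL && dev != NULL && block != NULL);

    for (size_t t = 0; t < ntapes; t++) {
        for (const struct model_extent *e = tapes[t].first; e; e = e->next) {
            unsigned long len = e->end - e->start + 1;

            if (n < len) {
                *dev = tapes[t].dev;
                *block = e->start + n;
                return 0;
            }
            n -= len;   /* skip this extent and keep counting */
        }
    }
    return -1;
}
```

Because the lookup needs only device numbers and block ranges, the same walk covers a swap partition, a swap file or an ordinary file once it has been bmapped, and no filesystem code is involved at resume time.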
Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump
Hi.

On Thursday 27 September 2007 16:33:54 Huang, Ying wrote:
> On Wed, 2007-09-26 at 16:30 -0400, Joseph Fannin wrote:
> > But, in my ignorance, I'm not sure even fixing the ext3 bug will guarantee you consistent metadata so that you can handle a swap/hibernate file. You can do a sync(), but how do you make that not race against running processes without the freezer, or blkdev snapshots?
> >
> > I guess uswsusp and the-patch-previously-known-as-suspend2 handle this somehow, though.
>
> The image-writing kernel of kexec based hibernation runs in a controlled way. It is not used by normal users, so only the really necessary processes need to be run. For example, it is possible that there is only one user process -- the image-writing process running in the image-writing kernel. So, no freezer or blkdev snapshot is needed.

You're thinking of the wrong kernel - we were talking about prior to switching to the kexec'd kernel while suspending.

Regards,

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info.
Fwd: [Suspend2-devel] [patch] 2.2.10.3 build fixes
Hi Rafael et al.

This looks like it will be vanilla material, maybe 2.6.23 material?

Regards,

Nigel

----------  Forwarded Message  ----------

Subject: [Suspend2-devel] [patch] 2.2.10.3 build fixes
Date: Sunday 30 September 2007
From: Roman Dubtsov (dubtsov gmail com)

Hi,

I have recently run into a build issue with 2.6.22 and tuxonice 2.2.10.3. When building a custom kernel with make-kpkg, the process failed with a message saying: fs.h requires linux/freezer.h, which does not exist in exported headers.

Here's a quick-n-dirty patch which fixes this. Hope it is useful.

--- 2.6.22-toi/include/linux/Kbuild.orig	2007-09-30 01:21:30.0 +0700
+++ 2.6.22-toi/include/linux/Kbuild	2007-09-29 23:52:52.0 +0700
@@ -202,6 +202,7 @@ unifdef-y += filter.h
 unifdef-y += flat.h
 unifdef-y += futex.h
 unifdef-y += fs.h
+unifdef-y += freezer.h
 unifdef-y += gameport.h
 unifdef-y += generic_serial.h
 unifdef-y += genhd.h
Re: Fwd: [Suspend2-devel] [patch] 2.2.10.3 build fixes
Hi.

On Monday 01 October 2007 05:56:45 Rafael J. Wysocki wrote:
> Hi,
>
> On Sunday, 30 September 2007 13:44, Nigel Cunningham wrote:
> > Hi Rafael et al.
> >
> > This looks like it will be vanilla material, maybe 2.6.23 material?
>
> Well, I wouldn't like to export freezer.h. Why exactly would that be necessary?

A module that starts a freezeable kthread? I can ask for more details, and will if you like.

Regards,

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info.
Re: Fwd: [Suspend2-devel] [patch] 2.2.10.3 build fixes
Hi.

On Monday 01 October 2007 08:28:02 Rafael J. Wysocki wrote:
> On Sunday, 30 September 2007 23:43, Nigel Cunningham wrote:
> > On Monday 01 October 2007 05:56:45 Rafael J. Wysocki wrote:
> > > On Sunday, 30 September 2007 13:44, Nigel Cunningham wrote:
> > > > Hi Rafael et al.
> > > >
> > > > This looks like it will be vanilla material, maybe 2.6.23 material?
> > >
> > > Well, I wouldn't like to export freezer.h. Why exactly would that be necessary?
> >
> > A module that starts a freezeable kthread? I can ask for more details, and will if you like.
>
> Yes, please.

Ah. My bad. I should have looked at it more carefully before forwarding; it's a result of my modifications for fuse support. Sorry for the noise.

Nigel
--
See http://www.tuxonice.net for Howtos, FAQs, mailing lists, wiki and bugzilla info.