Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Nigel Cunningham
Hi.

On Thu, 2007-04-19 at 00:22 +0200, Christian Hesse wrote:
 On Thursday 19 April 2007, Ingo Molnar wrote:
  * Christian Hesse [EMAIL PROTECTED] wrote:
although probably your suspend2 problem is still not fixed, it's
worth a try nevertheless. Which suspend2 patch did you apply, and
was it against -rc6 or -rc7?
  
   You are right again. ;-)
  
   Linux 2.6.21-rc7
   Suspend2 2.2.9.11 (applies cleanly to -rc7)
   CFS v3 (without any additional patches)
  
   And it still hangs on suspend.
 
  what's the easiest way for me to try suspend2? Apply the patch, reboot
  into the kernel, then execute what command to suspend? (there's a
  confusing mishmash of initiators of all the suspend variants. Can i drive
  this by echoing to /sys/power/state?)
 
 Perhaps you have to install suspend2-userui as well for the output (I'm not
 sure whether it works without). Then you can trigger the suspend by echoing
 to /sys/power/suspend2/do_suspend.
 Useful information can be found in the Howto:
 
 http://www.suspend2.net/HOWTO
 
 I dropped some ccs to not abuse Linus and friends.

You can suspend and resume without it.

Regards,

Nigel


signature.asc
Description: This is a digitally signed message part


Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Nigel Cunningham
Hi.

On Wed, 2007-04-18 at 18:56 -0400, Bob Picco wrote:
 Ingo Molnar wrote:[Wed Apr 18 2007, 06:02:28PM EDT]
  
  * Christian Hesse [EMAIL PROTECTED] wrote:
  
although probably your suspend2 problem is still not fixed, it's 
worth a try nevertheless. Which suspend2 patch did you apply, and 
was it against -rc6 or -rc7?
   
   You are right again. ;-)
   
   Linux 2.6.21-rc7
   Suspend2 2.2.9.11 (applies cleanly to -rc7)
   CFS v3 (without any additional patches)
   
   And it still hangs on suspend.
  
  what's the easiest way for me to try suspend2? Apply the patch, reboot 
  into the kernel, then execute what command to suspend? (there's a 
  confusing mishmash of initiators of all the suspend variants. Can i drive
  this by echoing to /sys/power/state?)
  
  Ingo
 I had hoped to collect more data with CFS V2. It crashes in
 scale_nice_down for s2ram when attempting to disable_nonboot_cpus. 
 So part of traceback looks like (typed by hand with obvious omissions):
 
 scale_nice_down
 update_stats_wait_end - not shown in traceback because inlined
 pick_next_task_fair
 migration_call
 task_rq_lock
 notifier_call_chain
 _cpu_down
 disable_nonboot_cpus
 ...
 
 This is standard -rc7 with V2 CFS applied. It could be a completely
 unrelated issue. I'll attempt to debug further tomorrow.

That - and Christian's other reply with the jpg - look to me more like
this is an interaction between CFS and cpu hotplugging than Suspend2
itself. Can you also reproduce this with swsusp?

Regards,

Nigel




Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Nigel Cunningham
Hi.

On Thu, 2007-04-19 at 00:02 +0200, Ingo Molnar wrote:
 * Christian Hesse [EMAIL PROTECTED] wrote:
 
   although probably your suspend2 problem is still not fixed, it's 
   worth a try nevertheless. Which suspend2 patch did you apply, and 
   was it against -rc6 or -rc7?
  
  You are right again. ;-)
  
  Linux 2.6.21-rc7
  Suspend2 2.2.9.11 (applies cleanly to -rc7)
  CFS v3 (without any additional patches)
  
  And it still hangs on suspend.
 
 what's the easiest way for me to try suspend2? Apply the patch, reboot 
 into the kernel, then execute what command to suspend? (there's a 
 confusing mishmash of initiators of all the suspend variants. Can i drive
 this by echoing to /sys/power/state?)

From subsequent emails, I think you already got your answer, but just in
case...

Yes, if you enabled "Replace swsusp by default" and you already had it
set up for getting swsusp to resume. If not, and you're using an
initrd/ramfs, you'll need to modify it to echo >
/sys/power/suspend2/do_resume after /sys and /proc are mounted but
prior to mounting / and so on.
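
Both ways of starting a cycle mentioned in this thread are plain sysfs writes, so they can be scripted from any language. Below is a minimal Python sketch under a few assumptions: the helper names `write_trigger` and `demo` are made up for illustration, the paths are the ones quoted in the thread, and the default value written is a guess, since the thread only says to echo to the suspend2 entries without saying what to echo.

```python
import os
import tempfile

# Paths quoted in this thread. They exist only on a kernel that carries the
# relevant support, and writing to them requires root.
SWSUSP_STATE = "/sys/power/state"
SUSPEND2_DO_SUSPEND = "/sys/power/suspend2/do_suspend"


def write_trigger(path, value="1"):
    """Write `value` plus a newline to `path`, as `echo value > path` would.

    Hypothetical helper. The default value "1" is an assumption: the thread
    does not say what should be echoed to the suspend2 entries.
    """
    with open(path, "w") as handle:
        handle.write(value + "\n")


def demo():
    """Exercise the helper against a scratch file instead of real sysfs."""
    with tempfile.TemporaryDirectory() as scratch:
        fake_state = os.path.join(scratch, "state")
        write_trigger(fake_state, "disk")
        with open(fake_state) as handle:
            return handle.read()
```

On a real system, `write_trigger(SWSUSP_STATE, "disk")` run as root would start a suspend-to-disk cycle immediately, which is why the demo only touches a temporary file.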

Regards,

Nigel




Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy

2007-04-19 Thread Nigel Cunningham
Hi Ingo.

On Thu, 2007-04-19 at 09:04 +0200, Ingo Molnar wrote:
 * Nigel Cunningham [EMAIL PROTECTED] wrote:
 
  From subsequent emails, I think you already got your answer, but just 
  in case...
  
  Yes, if you enabled "Replace swsusp by default" and you already had it
  set up for getting swsusp to resume. If not, and you're using an
  initrd/ramfs, you'll need to modify it to echo >
  /sys/power/suspend2/do_resume after /sys and /proc are mounted but
  prior to mounting / and so on.
 
 yeah, went with the default suggested by your patch:
 
CONFIG_SUSPEND2_REPLACE_SWSUSP=y
 
 and it was pretty easy to set things up. I used echo disk >
 /sys/power/state to trigger it.
 
 In hindsight it was all pretty straightforward and suspend2 worked 
 beautifully on an UP and on an SMP system i tried. So in exchange for 
 suspend2 folks debugging a bug in CFS here's some suspend2 review 
 feedback ;) Any plans about moving suspend2 to the upstream kernel? It 
 should be pretty easy for it to co-exist with the current swsuspend 
 code.

I really would like to get it into Linus' tree but Pavel doesn't want it
(obviously!) and I haven't got together enough of a case yet to convince
Andrew. I have yet another here's-why-I-think-it-should-be-merged email in
the works (poor Andrew!) but there are too many other things on my plate
at the mo.

 The patch has quite some size:
 
  89 files changed, 16452 insertions(+), 69 deletions(-)
 
 that should obviously be split up into more than a dozen sub-patches, 
 and fed to lkml with the small ones first. (unless it already is split 
 up?)

Right. A good portion (~2000 lines) of that is documentation.

 i cannot comment on the kernel/power/ bits (they are way too large 
 anyway), other than that they look pretty clean visually, but the 
 lowlevel arch and generic kernel bits look sane in detail too, sans a 
 few mostly trivial cleanliness issues:
 
 +int suspend2_faulted = 0;
 +EXPORT_SYMBOL(suspend2_faulted);
 
 should be done via the pagefault notifier chain mechanism. Also, all the 
 exports you added should be EXPORT_SYMBOL_GPL().

I'll look at that, but I'm not sure if it's a good idea - this is for
during the atomic copy & restore, when DEBUG_PAGEALLOC is enabled on
x86. Other things might touch memory in ways we don't want. It's only
needed for slab pages that get unmapped but not freed.

As far as the module exports go, I'm not expecting them to get merged. I
like building Suspend2 as modules (it helps speed the development
cycle), and see it as potentially useful for embedded but IMO there are
too many export symbols to make merging that code a possibility. This is
why they're all in one file rather than sprinkled through the files that
define the symbols.

 this:
 
 -   ClearPageReserved(virt_to_page(addr));
 -   init_page_count(virt_to_page(addr));
 +   //ClearPageReserved(virt_to_page(addr));
 +   //init_page_count(virt_to_page(addr));
 
 looks like there's a buglet in there still somewhere?

Yeah. When I was recently debugging, I found that cpu hotplugging is
using something marked __init which is causing the machine to
spontaneously reboot when cpus are replugged if DEBUG_PAGEALLOC is
enabled. Haven't had the time to get back to it, and also need some help
with the approach (what makes the machine reboot in this case instead of
oopsing, and how do I stop it?).

 +   if(PageHighMem(page))
 +   return 0;
 
 coding style.

Oh. The space missing after the if? Ok.

 +   BUG_ON( test_suspend_state(SUSPEND_RUNNING)   /* Suspend2, that is 
 */
 
 make this a WARN_ON() or a WARN_ON_ONCE() - that way you have a chance 
 to even get feedback from users, instead of a 'uhm, X froze' report.
 
 +#define FREEZER_OFF 0
 +#define FREEZER_USERSPACE_FROZEN 1
 +#define FREEZER_FULLY_ON 2
 
 should be:
 
 +#define FREEZER_OFF                  0
 +#define FREEZER_USERSPACE_FROZEN     1
 +#define FREEZER_FULLY_ON             2
 
 (you want your reviewers to have a pleasant time reading your code :)

Ok.

 +#define NETLINK_SUSPEND2_USERUI 20  /* For suspend2's userui */
 
 IIRC userui was at the center of suspend2 merge flames, right? So you 
 might want to layer it ontop a less flashy suspend2-core and thus get 
 90% of your patch upstream?

Ok. I've just separated that into its own file/module, so that will be
straightforward to do.

 +++ linux/mm/vmscan.c
 
 the MM impact looks quite nontrivial. But i suspect this is unavoidable, 
 because you zap portions of the pagecache on the way to disk, so when it 
 comes back it results in a different pagecache (new lru lists, etc.), 
 right?

The modifications do three things.

First, we're seeking to keep the LRU static while we're suspending.
I originally sought to achieve that by avoiding entering the vmscan.c
logic (not as drastic as it sounds - Suspend2 is the only thing
running!). I think it was Nick who said he'd rather see it the pages
unlinked and kept safe

Re: VMWare Workstation 6 for debugging Linux Kernel (!)

2007-04-20 Thread Nigel Cunningham
Hi.

On Fri, 2007-04-20 at 14:45 +0300, Avi Kivity wrote:
 Andi Kleen wrote:
  Xavier Bestel [EMAIL PROTECTED] writes:
 

  On Fri, 2007-04-20 at 00:46 +0200, roland wrote:
 
  
  We just quietly added an exciting feature to Workstation 6.0. I believe
  it will make WS6 a great tool for Linux kernel development. You can now
  debug kernel of Linux VM with gdb running on the Host without changing
  anything in the Guest VM. No kdb, no recompiling and no need for second
  machine. All you need is a single line in VM's configuration file.

  I think qemu has the exact same feature.
  
 
  It doesn't seem to work for x86-64 there though.

 
 kvm's qemu has a patch that allows qemu to be an x86_64 gdbserver (with
 or without kvm).

I meant that VMware wasn't working, but it is now. I was trying a
64-bit host and client, and needed to know both the different line in
the config file and the different port number.

Regards,

Nigel




Re: VMWare Workstation 6 for debugging Linux Kernel (!)

2007-04-20 Thread Nigel Cunningham
Hi.

On Fri, 2007-04-20 at 04:21 -0700, Petr Vandrovec wrote:
 Andi Kleen wrote:
  Xavier Bestel [EMAIL PROTECTED] writes:
  
  On Fri, 2007-04-20 at 00:46 +0200, roland wrote:
 
  We just quietly added an exciting feature to Workstation 6.0. I believe
  it will make WS6 a great tool for Linux kernel development. You can now
  debug kernel of Linux VM with gdb running on the Host without changing
  anything in the Guest VM. No kdb, no recompiling and no need for second
  machine. All you need is a single line in VM's configuration file.
  I think qemu has the exact same feature.
  
  It doesn't seem to work for x86-64 there though.
 
 Hello,
 
 Do you mean with qemu or with VMware?  Yes, we do not support replay 
 with 64bit guests, but debug interface should just work.  Only gotcha is 
 that for 64bit guest you need another option:
 
 debugStub.listen.guest64 = TRUE

Ah. That might help :)

 and then you need to attach gdb to port 8864 (*).  Unfortunately it does 
 not seem possible to build gdb which would support 16bit/32bit code 
 while using 64bit gdb on-wire format, so there are two interfaces.  And 
 if you single-step switch from 64bit mode to 32bit mode or back, you 
 also have to switch gdbs.  Yes, it is a bit unintuitive, and 
 additionally one gdb silently ignores breakpoints set up by other gdb, 
 so you need to keep breakpoints in sync between two gdbs yourself :-(
 
 (*) If you are using gdb which has both 32bit and 64bit support, be sure 
 to issue appropriate 'set architecture xxx' before 'target remote 
 localhost:88xx' (i386:x86-64 for port 8864, i386 or i8086 for port 
 8832).  Otherwise gdb is going to die complaining it could not parse 
 remote reply.

That too.
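
The port and architecture pairing described above is easy to get wrong by hand, so here is a small Python sketch that only assembles the two gdb commands in the required order. The table values are taken from the message quoted above; the function name `gdb_commands` and the table name `GDB_TARGETS` are invented for illustration, and nothing here talks to gdb or VMware.

```python
# Architecture and port pairs as given in the message above.
GDB_TARGETS = {
    64: ("i386:x86-64", 8864),
    32: ("i386", 8832),
    16: ("i8086", 8832),
}


def gdb_commands(guest_bits, host="localhost"):
    """Return the gdb commands for a guest currently running in this mode."""
    architecture, port = GDB_TARGETS[guest_bits]
    # 'set architecture' has to be issued before 'target remote'.
    return [
        "set architecture " + architecture,
        "target remote {}:{}".format(host, port),
    ]
```

For a 64-bit guest, `gdb_commands(64)` yields the `set architecture i386:x86-64` line followed by the connection to port 8864; the `debugStub.listen.guest64 = TRUE` option still has to be present in the VM's configuration file.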

Thanks!

Nigel




Reasons to merge suspend2.

2007-04-24 Thread Nigel Cunningham
Hi all.

I've been working on this email on and off for a while, but since Pavel
raised the issue again, I thought I should make a concerted effort to
finish it...

In this email, I'm going to outline the problems with the current design
(uswsusp and swsusp) and the ways in which Suspend2 overcomes those
limitations, before going on to outline the additional advantages
Suspend2 has for users and address objections previously raised against
merging Suspend2.

A) Problems with the current design.


1) Ordering of operations.

The current [u]swsusp design doesn't do things in discrete, well ordered
stages. Storage for the image is not allocated until after the atomic
copy has been done. This means that the process can fail when we are a
significant portion of the way into suspending, and it means it can fail
when the user will seriously expect it to run to completion. The
solution to this issue is simple: separate preparing to suspend from
actually writing the image. In the preparation step, ensure, so far as
you are able, that there will be sufficient memory and sufficient
storage to complete the process, and don't write anything or do any
atomic copying until after that has been done.

The only valid objection I can think of is that you can't know for
certain prior to doing the atomic copy how much memory & storage will be
needed for allocations by driver suspend methods. That can be addressed
by a simple extension of the driver model, wherein drivers could report
how many pages they will need. (If slab will be needed, the worst case
can be assumed). Rafael's notify patches (recently posted) also help in
that area.

Once processes are frozen, all significant memory usage can be accounted
for, because the process doing the suspending will be the only one
allocating memory.
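
The ordering argued for here, check everything first and only then copy or write, can be sketched in a few lines of Python. This is an illustration of the idea only, not Suspend2's code: `Resources`, `prepare_to_suspend` and all of the parameter names are hypothetical, and the page counts would in practice come from the kernel and from whatever drivers report.

```python
from collections import namedtuple

# Hypothetical snapshot of what is free once processes are frozen.
Resources = namedtuple("Resources", "free_ram_pages free_storage_pages")


def prepare_to_suspend(ram_pages_needed, storage_pages_needed,
                       driver_pages, available):
    """Return (ok, reason). Runs before any atomic copy or image I/O.

    driver_pages is the number of pages drivers say their suspend methods
    will allocate, which is the extension to the driver model suggested above.
    """
    if ram_pages_needed + driver_pages > available.free_ram_pages:
        return False, "not enough memory"
    if storage_pages_needed > available.free_storage_pages:
        return False, "not enough storage"
    return True, "ready"
```

Because both checks run before anything is copied or written, a failure at this point leaves the system exactly as it was, which is the property the paragraph above asks for.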

2) Limit on image size.

The current implementation limits the size of an image to an absolute
maximum of half the amount of ram. This is certainly an improvement over
the old days where it sought to free everything it could, but it's still
not good enough. Current memory freeing code doesn't free the exact
amount requested; often far more than has been requested is freed. This
does not only result in a smaller image. It also means the system is
proportionately less responsive after resume, whenever those pages are
needed again. A full image is certainly not needed by
everyone. Those with huge amounts of memory, very fast storage devices
or particular memory usage patterns may, quite rightly, not want to
store the whole lot in an image. This doesn't mean, however, that those
who want or need (from their perspective) a full image of memory
shouldn't be able to have it. It just adds to the argument for making it
tunable (which swsusp has done too).

3) Lack of provision for tuning to individual needs.

Swsusp historically included very little provision whatsoever for the
user to tune their configuration. This has recently begun to change, and
I applaud that. But it needs to go further. Suspending to disk is not a
one-size-fits-all situation. People have different hardware
configurations, with the result being that some people benefit from
compression while others do better without it. Some people want
encryption in a particular configuration while others don't care about
encryption at all. Some people want to limit the image size, others
don't. Sometimes a user might want to reboot instead of powering down
(dual booting). All of this should be doable, without having to hack the
code or recompile the kernel, and should be as simple as possible.
Suspend2, via its /sys/power/suspend2 interface and hibernate-script
porcelain, makes this easy.

4) No support for multiple swap devices / non swap storage.

Until recently, [u]swsusp supported a single swap partition only.
Support for a swap file has been added, but [u]swsusp still supports
only one swap device at a time. For most people, this is adequate, but
this doesn't mean everyone should be forced to fit this mould.

[u]swsusp also lacks support for storage to non-swap. Particularly in
systems that rely on swap for normal activity, this can make [u]swsusp
less reliable. The amount of swap available varies according to
workload, so sometimes the user will be unable to suspend. To address
this raciness/competition against other swap usage, Suspend2 supports
writing to a generic file, either a partition or a file on an ordinary
partition.

B) Further advantages of Suspend2.

1) Improvements over swsusp.


a) Modular design.

Parts of Suspend2 implement support for storing an image in swap or in a
file, using cryptoapi for compression and/or encryption and talking to a
userspace user interface via a netlink socket. Suspend2 works just fine
without CONFIG_SWAP, CONFIG_NET and/or CONFIG_CRYPTOAPI, however,
because it uses a modular design wherein support for these subsystems is
abstracted 

Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

2007-04-25 Thread Nigel Cunningham
Hi.

On Wed, 2007-04-25 at 07:29 +, Pavel Machek wrote:
 Hi!
 
   I absolutely detest all suspend-to-disk crap. Quite frankly, I hate 
   the whole thing. I think they've _all_ caused problems for the true 
   suspend (suspend-to-ram), and the last thing I want to see is three or 
   four different suspend-to-disk implementations.  So unlike Ingo, I 
   don't think "let's just integrate them all side-by-side and maintain
   them and look who wins" is really a good idea.
  
   How many different magic ioctl's does the thing introduce? Is it 
   really just *two* entry-points (and how simple are they, 
   interface-wise), and nothing else?
  
  userspace-driven-suspend is already in the kernel, today. So it's not 
  really two versions side by side doing the same thing, but more of:
  
 A B C + D E F G H
  
  where ABC is used by the uswsusp code today, and ABCDEFGH is used by 
  suspend2. So any suspend2 merge would largely be about adding DEFGH. 
 
 Actually, we have 'D H' in kernel, today. It is called swsusp...
 (Encryption, swapFile support and Graphical progress are missing from
 today's kernel.)

Along with a lot of other things (see my "Reasons to merge suspend2"
email from earlier in the day).

  My original mail was about the following thing: i tried the suspend2 
  patch (which just makes echo disk > /sys/power/state work as expected,
  as long as you give the booting up kernel image an idea about where the 
 
 ..and it means that 'echo disk > ...' should work w/o suspend2 patch,
 too. (Just try it). You'll miss compression part, but that provides
 only small speedup.

Please don't spread misinformation to support your case. LZF compression
(which is what all Suspend2 users use AFAIK) generally doubles the speed
of your cycle.

Nigel




Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

2007-04-25 Thread Nigel Cunningham
Hi.

On Wed, 2007-04-25 at 10:48 +0200, Xavier Bestel wrote:
 On Wed, 2007-04-25 at 07:23 +, Pavel Machek wrote:
   I absolutely detest all suspend-to-disk crap. Quite frankly, I hate the 
   whole thing. I think they've _all_ caused problems for the true suspend 
   (suspend-to-ram), and the last thing I want to see is three or four 
  
  Well, it is a bit more complex than that.
  
  suspend-to-disk is a workaround for
  
  'suspend-to-ram eats too much power' (plus some details like
  being able to replace battery).
  
  suspend-to-ram is a workaround for
  
  'idle machine takes way too much power' (plus some details
  like don't spin the disk so that machine is safe to carry).
 
 I think it depends on who you ask. I personally think that suspend-to-
 $youchoose is a workaround for the slowness of system startup. I never
 turn off my laptop, I just suspend it.
 
 (And guess what, it uses APM and suspend is really faster and way more
 reliable than each kernel implementation I could try).

If you tried Suspend2 and had problems with reliability, please send me
logs. I'll do all I can to help. (I have to qualify it a bit, because
I'm not able to fix drivers, but if it's a Suspend2 issue, tell me and
I'll fix it).

Regards,

Nigel




Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

2007-04-25 Thread Nigel Cunningham
Hi.

On Wed, 2007-04-25 at 11:07 +0200, Xavier Bestel wrote:
 On Wed, 2007-04-25 at 18:50 +1000, Nigel Cunningham wrote:
   (And guess what, it uses APM and suspend is really faster and way more
   reliable than each kernel implementation I could try).
  
  If you tried Suspend2 and had problems with reliability, please send me
  logs. I'll do all I can to help. (I have to qualify it a bit, because
  I'm not able to fix drivers, but if it's a Suspend2 issue, tell me and
  I'll fix it).
 
 Does suspend2 work with APM ? After much trying, I think now the ACPI
 implementation of my laptop (a vintage Compaq Armada 1700) is busted,
 only APM works.

It should do. If you set the powerdown method to 0, it will use
machine_power_off() instead of trying to use acpi, fall back to
machine_halt() if that fails and lastly (should not be needed) a
while(1) cpu_relax() loop.
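
The fallback order described above can be modelled in user space as a chain of calls where simply returning counts as failure. The Python sketch below is a simulation under that assumption: `power_down`, `still_running` and `demo` are hypothetical names, the strings merely label the kernel functions named above, and no real power-off call is made.

```python
def power_down(steps, spin):
    """Try each power-down step in order, then spin if all of them return.

    A step that works never returns on real hardware, so coming back from
    the call is what counts as failure in this model.
    """
    attempted = []
    for name, step in steps:
        attempted.append(name)
        step()  # returning means the machine is still running
    spin()  # last resort, standing in for the while(1) cpu_relax() loop
    return attempted


def still_running():
    """Stand-in for a power-down call that fails and returns."""


def demo():
    """Run the chain with stubs that all fail, and report the order tried."""
    steps = [
        ("machine_power_off", still_running),
        ("machine_halt", still_running),
    ]
    return power_down(steps, spin=lambda: None)
```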

 AFAIR the problem with suspend2 was that it didn't poweroff some parts
 of the laptop (the led of the wifi pcmcia card was on, and the lcd light
 was on too), but that was last year. Kernel's suspend kind of worked but
 didn't resume (no reaction on button press). As I tried all this last
 year, I may have forgotten some things.

The code to poweroff those parts will be dependent on the drivers
(assuming I'm making the right calls). If it's something where swsusp
works and suspend2 doesn't, it will be because I'm doing something
wrong. If they both don't do the right thing, then it's probably the
driver.

 Honestly, I like this laptop when it works flawlessly, so I don't see
 many reasons to try *susp* again. I'll do it when I'm bored, just not
 today.

Okay :) Just let me know if I can help.

Nigel




Re: [PATCH] Use more gcc extensions in the Linux headers

2007-03-09 Thread Nigel Cunningham
Hi.

On Fri, 2007-03-09 at 23:03 -0500, [EMAIL PROTECTED] wrote:
 On Sat, 10 Mar 2007 09:57:32 +1100, Rusty Russell said:
 
  +/* GCC is awesome. */
   #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) \
  + sizeof(typeof(int[1 - 2*!!__builtin_types_compatible_p(typeof(arr), \
   typeof(&arr[0]))]))*0)
 
 -/* GCC is awesome. */
 +/* GCC leaves me speechless. */

A speechless Rusty would be horrible. That said, it would be nice if the
comment was something more like the normal Rusty pearl of wisdom. I
understand the first part, but have no idea what + sizeof(typeof(int[
does...

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NAK new drivers without proper power management?

2007-02-10 Thread Nigel Cunningham
Hi.

On Sat, 2007-02-10 at 23:20 +0100, Rafael J. Wysocki wrote:
 Hi,
 
 On Saturday, 10 February 2007 20:38, Pavel Machek wrote:
  Hi!
  
I don't think this is already done (feel free to correct me if I'm
wrong)..

Can we start to NAK new drivers that don't have proper power management
implemented? There really is no excuse for writing a new driver and not
putting .suspend and .resume methods in anymore, is there?
   
   to a large degree, a device driver that doesn't suspend is better than
   no device driver at all, right?
   now.. if you want to make the core warn about it, that's very fair
  
  Well, driver that is broken on SMP is arguably better than no driver
  at all, yet we'd probably avoid merging that. It would be nice to
  start including suspend in 'must work' list...
 
 What about this:
 
 If the device requires that, implement .suspend and .resume or at least
 define .suspend that will always return -ENOSYS (then people will know they
 have to unload the driver before the suspend).  Similarly, if you aren't sure
 whether or not the device requires .suspend and .resume, define .suspend that
 will always return -ENOSYS.

If your device requires power management, and you know it requires power
management, why not just implement power management? Doing -ENOSYS
instead is like saying -ESPAMMEBECAUSEIMLAZY.

Let me put it another way: People keep talking about Linux being ready
for the desktop. To me at least (but I dare say for lots of other people
too), being ready for the desktop means that things just work, without
having to recompile kernels or bug driver authors or wait twelve
months. 

And it means that doing a bare minimum isn't enough. We keep claiming
that Open Source is better than Proprietary software. If we accept
half-pie jobs of implementing support for anything - driver power
management support or hibernation support or whatever - as 'good
enough', we're undercutting that argument. Linux's power management
support should - as far as we're able - be at least as good as that
other operating system's and preferably way, way better.

-ENOSYS is just not acceptable.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 22:52 +0100, Willy Tarreau wrote:
 On Sun, Feb 11, 2007 at 12:31:14PM -0600, Robert Hancock wrote:
  Willy Tarreau wrote:
  Nigel, don't take it as a personal offense, but I think it is a very
  centric view of Linux usages. Where I work, Linux is used a lot on
  servers and appliances. It is used for mail relays, HTTP proxies,
  anti-viruses, firewalls, routers, load balancers, UTM, SSH relays,
  etc... Nobody would ever want to enable power management on those
  machines, let alone suspend which would cause a major havoc, would
  the system decide to enter suspend for any reason.
  
  Many people also have Linux on their notebooks, but as a dual-boot. You
  read the word ? dual-boot. It means that they cleanly shutdown their
  system every time they don't use it anymore, and they won't know what
  OS they'll use next time.
  
  I've never heard anyone there complaining oh, I'm fed up with this
  boring boot, I always have to wait 30 seconds when I need to do
  something, I wish I could suspend and resume. It is considered the
  normal way of using their PCs.
  
  I think your experience is rather different than that of Joe Average 
  User who doesn't frequent kernel lists, and also I think you'll find 
  that for a lot of Linux laptop users that don't use suspend, the reason
  is that it doesn't work reliably, quite often due to driver issues.
 
 I would believe it if I knew people using suspend/resume on the other OS.
 But that's not the case either. Also, it happens that with today's RAM
 sizes, suspend-to-disk then resume can be several times slower than a
 clean fresh boot. When you have 1 GB to write at 20 MB/s, it takes 50
 seconds to shut down, and as much to restart. Compare this to 5-10
 seconds for a shutdown and 30-50 seconds for a cold boot, and it might
 give you another clue why there are people not interested in such a
 feature.

I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv
card that Linux doesn't support well yet), and I know other Suspend2
users doing the same. It's made easier by the fact that Suspend2 lets
you reboot instead of powering down.

As to comparing the speed with the time to boot, your estimates are way
out. Both will of course vary with the harddrive and cpu speeds and
compression qualities of the image, but with Suspend2, I'm seeing speeds
more in the range of 40-100MB/s, and even had a report of 160MB/s a
couple of days ago. The rule of thumb I use is:

Run hdparm -t (or equiv) on the drive you'll be using:

$ sudo hdparm -t /dev/hda

/dev/hda:
 Timing buffered disk reads:  120 MB in  3.02 seconds =  39.70 MB/sec

Then calculate RAM_IN_MB / 2 / HDPARM_RESULT = seconds to read/write
image.

In my case: 1024 / 2 / 39.7 = approx 12 seconds. The / 2 is because with
LZF compression, you normally get about 50% compression.
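
The rule of thumb above is simple enough to put into a function. This Python sketch just restates the arithmetic; the function name `image_io_seconds` and the `compression_ratio` parameter are introduced here for illustration, with 0.5 standing for the "about 50%" quoted for LZF and 1.0 for no compression at all.

```python
def image_io_seconds(ram_mb, disk_mb_per_s, compression_ratio=0.5):
    """Estimate the seconds needed to read or write a full image.

    compression_ratio is compressed size divided by original size.
    """
    if disk_mb_per_s <= 0:
        raise ValueError("disk throughput must be positive")
    return ram_mb * compression_ratio / disk_mb_per_s
```

With the hdparm figure above, `image_io_seconds(1024, 39.7)` comes to a little under 13 seconds, while `image_io_seconds(1024, 20, compression_ratio=1.0)` reproduces the roughly 50 seconds from the uncompressed 20 MB/s estimate quoted earlier.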

I think the main reason some people aren't interested in suspend to disk
is because of myths (if you'll excuse the term) like the one you've put
above. Of course the values you give were more accurate for swsusp and
uswsusp until recently, but Suspend2 has had async I/O and compression
for years, so all I can really do is encourage you to try again.

Of course there's another factor you're not taking into account: With
suspending to disk, you don't have to close and reopen documents or shut
down and restart applications. The time to do that should be factored
into the non-suspend-to-disk time to compare apples with apples.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 00:45 +0100, Tilman Schmidt wrote:
 Am 10.02.2007 23:37 schrieb Nigel Cunningham:
  If your device requires power management, and you know it requires power
  management, why not just implement power management? Doing -ENOSYS
  instead is like saying -ESPAMMEBECAUSEIMLAZY.
 
 Like it or not, power management is far from trivial, and people
 writing device drivers have limited resources. Calling them lazy
 does not help that in the least. If you try to put pressure on them
 by refusing to merge their work as long as it doesn't provide this
 or that functionality, you *may* end up with a few drivers having
 that functionality which otherwise wouldn't, but you *will* also
 end up with a number of drivers never making it into the kernel
 because their authors just have to give up.

It's not that complex. All we're really talking about is a bit of extra
code to cleanup and configure hardware state; things that the driver
author already knows how to do. S3 might require a bit more
initialisation if firmware needs to be reloaded or more extensive
configuration needs to be done, but if there's firmware to be loaded,
there is a reasonably good probability that we loaded it from Linux to
start with anyway.

 Also, in your argument you neglected a few cases:
 - What if my device does not require power management?

Then you use a generic routine that does nothing but return success
(potentially shared with other drivers that are in the same situation).

 - What if I don't know whether my device requires power management?

The questions are straightforward: Is there hardware state that needs
to be configured if you've just booted the computer and nothing else has
touched it? If so, that needs to be done in a resume method. Do you need
to clean up state prior to doing the things in the resume method, or
otherwise do things to quiesce the driver? If so, they will need to be
done in the suspend method. The result will be roughly similar to what
you do for module load/unload, except maybe less complete in some cases.
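
The three cases under discussion (hardware with state to restore, hardware with none, and a driver that just returns -ENOSYS) can be compared in a toy user-space model. The Python sketch below is not kernel code and does not use the kernel's driver API: `ToyDriver` and `suspend_all` are hypothetical, and the rule that a failed suspend resumes the devices already suspended is an assumption of this model.

```python
import errno


class ToyDriver:
    """User-space stand-in for a driver with suspend and resume hooks."""

    def __init__(self, name, needs_state_restore=True, lazy=False):
        self.name = name
        self.needs_state_restore = needs_state_restore
        self.lazy = lazy
        self.configured = True

    def suspend(self):
        if self.lazy:
            return -errno.ENOSYS  # refuses, so the whole suspend fails
        if self.needs_state_restore:
            self.configured = False  # quiesce and drop hardware state
        return 0  # nothing to save: the generic "return success" case

    def resume(self):
        if self.needs_state_restore:
            self.configured = True  # redo the configuration done after boot
        return 0


def suspend_all(drivers):
    """Suspend in order; on the first error, resume what was suspended."""
    done = []
    for driver in drivers:
        error = driver.suspend()
        if error:
            for earlier in reversed(done):
                earlier.resume()
            return error
        done.append(driver)
    return 0
```

In this model a single driver built with `lazy=True` makes `suspend_all` return `-errno.ENOSYS` for the whole machine, which is the user-visible cost that the -ENOSYS proposal trades against silent failure on resume.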

 - What if I know my device would require power management, but don't
   know how to implement it?

I've just told you above :) Now you know!

  Let me put it another way: People keep talking about Linux being ready
  for the desktop. To me at least (but I dare say for lots of other people
  too), being ready for the desktop means that things just work, without
  having to recompile kernels or bug driver authors or wait twelve
  months. 
 
 Exactly.
 
  And it means that doing a bare minimum isn't enough. We keep claiming
  that Open Source is better than Proprietary software. If we accept
  half-pie jobs of implementing support for anything - driver power
  management support or hibernation support or whatever - as 'good
  enough', we're undercutting that argument. Linux's power management
  support should - as far as we're able - be at least as good as that
  other operating system's and preferably way, way better.
  
  -ENOSYS is just not acceptable.
 
 Your argument falls down the moment you consider the alternative:
 not merging the driver means that the device won't work at all.
 (Given that out-of-tree drivers are actively discouraged, to put
 it mildly.) That's arguably farther from desktop readiness than
 a device not supporting power management.

I disagree (but I would, of course!). If we apply your logic
consistently, we should merge the driver as soon as any code is written
for it (anything is better than nothing). I'm simply arguing that a
driver handling suspend and resume should be as much of a requirement
as not causing memory corruption or the like.

Regards,

Nigel

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 01:44 +0100, Rafael J. Wysocki wrote:
  Well, it's probably more acceptable than silently doing nothing and the 
  device failing or locking up the machine on resume, but I couldn't agree 
  more that it's not what we want to be encouraging. Perfect may be the 
  enemy of the good, but "works except no power management" is hardly what
  I would call good these days, more like pretty sloppy.
 
 I think there are situations in which it can be justified, like:
 - The driver is not entirely finished, but we want to merge it early, because
 of many potential users,
 - The driver has only a few users who aren't interested in the suspend/resume
 functionality,

How do you determine that? How many users have to want suspend/resume
functionality before you say "Ok. It has to be done now"?

 - The device is undocumented and we don't know how to make it handle the
 suspend/resume (we may learn that in the future or not).

If we know how to initialise/cleanup, we know a good portion of what is
needed for suspend/resume. Sure, for some video chipsets, you need more
(you need to know how to reprogram the whole thing after S3), but
they're the exception. Yes, there are other cases. But on the whole,
we're not talking about esoteric knowledge.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
On Sun, 2007-02-11 at 01:27 +0100, Rafael J. Wysocki wrote:
 On Sunday, 11 February 2007 00:45, Tilman Schmidt wrote:
  Am 10.02.2007 23:37 schrieb Nigel Cunningham:
   If your device requires power management, and you know it requires power
   management, why not just implement power management? Doing -ENOSYS
   instead is like saying -ESPAMMEBECAUSEIMLAZY.
  
  Like it or not, power management is far from trivial, and people
  writing device drivers have limited resources. Calling them lazy
  does not help that in the least. If you try to put pressure on them
  by refusing to merge their work as long as it doesn't provide this
  or that functionality, you *may* end up with a few drivers having
  that functionality which otherwise wouldn't, but you *will* also
  end up with a number of drivers never making it into the kernel
  because their authors just have to give up.
  
  Also, in your argument you neglected a few cases:
  - What if my device does not require power management?
  - What if I don't know whether my device requires power management?
  - What if I know my device would require power management, but don't
know how to implement it?
 
 Plus:
 - What if I'm planning to implement the power management, but not just right
 now?

Why not right now?
 
Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 07:46 +0100, Willy Tarreau wrote:
 Hi Nigel,
 
 On Sun, Feb 11, 2007 at 09:37:06AM +1100, Nigel Cunningham wrote:
  On Sat, 2007-02-10 at 23:20 +0100, Rafael J. Wysocki wrote:
 (...)
   What about this:
   
   If the device requires that, implement .suspend and .resume or at least
   define .suspend that will always return -ENOSYS (then people will know
   they have to unload the driver before the suspend).  Similarly, if you
   aren't sure whether or not the device requires .suspend and .resume,
   define .suspend that will always return -ENOSYS.
  
  If your device requires power management, and you know it requires power
  management, why not just implement power management? Doing -ENOSYS
  instead is like saying -ESPAMMEBECAUSEIMLAZY.
 
 No, it means "Not implemented because I don't want to screw that driver with
 something I'm not expert in". And it also means "Other people will quickly
 notice it and will know how to fix this if they really need it".

Ok, that was a bit rough. Sorry.

At the same time though, we were talking about new drivers. If you know
enough to implement the rest of the driver, surely you know enough to
implement the power management part too. (See my previous comment about
the similarities to module load/unload code).

  Let me put it another way: People keep talking about Linux being ready
  for the desktop. To me at least (but I dare say for lots of other people
  too), being ready for the desktop means that things just work, without
  having to recompile kernels or bug driver authors or wait twelve
  months. 
 
 It's *one* usage of Linux. For this usage, you could also suggest to stop
 supporting UP kernels and always build everything with SMP enabled since
 more and more often, people will use multi-core systems. It will exempt
 the users from upgrading their kernels when they replace their CPU. We
 could also try to chase down all the drivers which do not correctly behave
 when the CPU switches to a lower frequency.
 
  And it means that doing a bare minimum isn't enough. We keep claiming
  that Open Source is better than Proprietary software. If we accept
  half-pie jobs of implementing support for anything - driver power
  management support or hibernation support or whatever - as 'good
  enough', we're undercutting that argument. Linux's power management
  support should - as far as we're able - be at least as good as that
  other operating system's and preferably way, way better.
  
  -ENOSYS is just not acceptable.
 
 Nigel, don't take it as a personal offense, but I think it is a very
 centric view of Linux usages. Where I work, Linux is used a lot on
 servers and appliances. It is used for mail relays, HTTP proxies,
 anti-viruses, firewalls, routers, load balancers, UTM, SSH relays,
 etc... Nobody would ever want to enable power management on those
 machines, let alone suspend which would cause a major havoc, would
 the system decide to enter suspend for any reason.

I agree.

 Many people also have Linux on their notebooks, but as a dual-boot. You
 read the word ? dual-boot. It means that they cleanly shutdown their
 system every time they don't use it anymore, and they won't know what
 OS they'll use next time.

Not necessarily. I dual boot our desktop machine, and hibernate both,
using grub to select which OS to run.

 I've never heard anyone there complaining "oh, I'm fed up with this
 boring boot, I always have to wait 30 seconds when I need to do
 something, I wish I could suspend and resume". It is considered the
 normal way of using their PCs.
 
 So globally, those hundreds of notebooks, workstations and servers
 will not be customers of the suspend code any time soon. It would
 be a shame to deprive them from working drivers. You must just
 accept that a lot of people are not interested in your work. It's
 the same for all of us here. I know that a lot of people are not
 interested in 2.4 anymore and I'm perfectly fine with that. I'm
 not asking 2.6 driver authors to ensure that their driver is easy
 to backport for instance.

Neither am I. I'm just asking that new drivers have power management as
standard.

 What I really think would be a clean solution would be sort of
 a capability. Either the driver *is* suspend/resume-capable, and
 the system can be suspended. Or it is not, and the system must
 refuse to suspend. It should not be a problem to proceed like
 this because drivers which will not support suspend will mainly
 be those which will not have to. And if a user occasionally
 complains that one driver does not support it, at least you will
 have a good argument against its author to implement suspend.

Yes, but why should the user have to complain to start with?
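
For reference, the capability idea quoted above could be sketched
roughly like this. It is user-space C, not kernel code, and everything
named toy_* is hypothetical. The sketch assumes a missing .suspend
method means "capability not declared" and treats that the same as a
method returning -ENOSYS: the whole suspend is refused and the drivers
already suspended are resumed again.

```c
#include <errno.h>
#include <stddef.h>

struct toy_driver {
    const char *name;
    int (*suspend)(void);   /* NULL means "capability not declared" */
    int (*resume)(void);
};

/* Example callbacks, for illustration only. */
int toy_resumed_count;
int toy_ok_suspend(void) { return 0; }
int toy_ok_resume(void)  { toy_resumed_count++; return 0; }

/*
 * Try to suspend every driver in order. If one lacks a suspend method or
 * returns an error, resume the drivers already suspended (in reverse
 * order) and return the error so the system stays up.
 * On failure, *culprit (if non-NULL) is set to the offending driver's name.
 */
int toy_suspend_all(const struct toy_driver *drv, size_t n,
                    const char **culprit)
{
    for (size_t i = 0; i < n; i++) {
        int err = drv[i].suspend ? drv[i].suspend() : -ENOSYS;

        if (err) {
            if (culprit)
                *culprit = drv[i].name;
            while (i-- > 0)
                if (drv[i].resume)
                    drv[i].resume();
            return err;
        }
    }
    return 0;
}
```

Reporting the culprit by name is what would give the user something
concrete to complain about, which is exactly the step I'd rather they
never had to take.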

Regards,

Nigel



Re: [PATCH] Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 12:13 +0000, Matthew Garrett wrote:
 On Sun, Feb 11, 2007 at 07:54:04AM +0100, Willy Tarreau wrote:
 
  instead of modifying all drivers to explicitly state that they don't support
  it, we should start with a test of the NULL pointer for .suspend, which
  should mean exactly the same without modifying the drivers. I find it
  obvious that a driver which does not provide a suspend function will not
  support it. And if some drivers (eg /dev/null) can support it anyway, it's
  better to change *those* drivers to explicitly mark them as compatible.
 
 No, that doesn't work. In the absence of suspend/resume methods, the PCI 
 layer will implement basic PM itself. In some cases, this works. In 
 others, it doesn't. There's no way to automatically determine which is 
 which without modifying the drivers.

I think we have it backwards there. Power management support for a
driver should always start with the driver itself. If there's a generic
routine that can be used for the bus, the driver should explicitly set
the routine to the generic routine.

Regards,

Nigel



Re: [PATCH] Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 19:53 +0100, Rafael J. Wysocki wrote:
  Having drivers explicitly marked as to whether they are safe is a good
  kernel feature; what to do if they're not is policy.
 
 That's true, but I assume that the people who opt for doing that are also
 willing to take part in the review of the drivers. :-)

Absolutely :)

 Well, I don't think so.  Let's estimate the number of drivers that define
 .resume() right now:
 
 $ grep -I -l -r '.resume =' linux-2.6.20/drivers/ | wc
     102     102    4169

I think the '.resume =' pattern undercounts - some drivers have tabs
before the '='. I ran '\.resume' and got 351.
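
To illustrate the whitespace problem, here is a small sketch using the
POSIX regex API. The function name line_sets_resume is made up; the
pattern accepts any run of spaces or tabs (or none) between .resume and
the '=', which a literal '.resume =' with exactly one space does not.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if `line` assigns a .resume member, allowing any mix of
 * spaces and tabs (or none at all) between the name and the '='.
 */
bool line_sets_resume(const char *line)
{
    regex_t re;
    bool hit;

    if (regcomp(&re, "\\.resume[[:space:]]*=",
                REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    hit = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return hit;
}
```

It is stricter than a bare '\.resume', since it skips members such as
.resume_early, so a count made this way need not equal the 351 above.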

It would be interesting to see how many struct pci_driver etc instances
lack resume methods.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 21:02 +0000, Alan wrote:
   If the device requires that, implement .suspend and .resume or at least
   define .suspend that will always return -ENOSYS (then people will know
   they have to unload the driver before the suspend).  Similarly, if you
   aren't sure whether or not the device requires .suspend and .resume,
   define .suspend that will always return -ENOSYS.
  
  Sounds ok to me. Where should this text go?
  Documentation/SubmittingDrivers ?
 
 And testing/submitting drivers, perhaps with additional text in that to
 make it clear we want suspend/resume support or good excuses
 
 Please verify your driver correctly handles suspend and resume. If it
 does not your patch submission is likely to be suspended and only resume
 when the driver correctly handles this feature

Maybe make it explicit that testing should be done for both suspend to
ram and to disk, and with the following usage scenarios as applicable?

- built in;
- modular, loaded while suspending but not loaded prior to resume from
disk;
- modular, loaded while suspending and loaded prior to resume from disk;

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-02-11 at 23:46 +0100, Willy Tarreau wrote:
  On Sun, 2007-02-11 at 22:52 +0100, Willy Tarreau wrote:
   On Sun, Feb 11, 2007 at 12:31:14PM -0600, Robert Hancock wrote:
Willy Tarreau wrote:
Nigel, don't take it as a personal offense, but I think it is a very
centric view of Linux usages. Where I work, Linux is used a lot on
servers and appliances. It is used for mail relays, HTTP proxies,
anti-viruses, firewalls, routers, load balancers, UTM, SSH relays,
etc... Nobody would ever want to enable power management on those
machines, let alone suspend which would cause a major havoc, would
the system decide to enter suspend for any reason.

Many people also have Linux on their notebooks, but as a dual-boot. You
read the word ? dual-boot. It means that they cleanly shutdown their
system every time they don't use it anymore, and they won't know what
OS they'll use next time.

I've never heard anyone there complaining "oh, I'm fed up with this
boring boot, I always have to wait 30 seconds when I need to do
something, I wish I could suspend and resume". It is considered the
normal way of using their PCs.

I think your experience is rather different than that of Joe Average 
User who doesn't frequent kernel lists, and also I think you'll find 
that for a lot of Linux laptop users that don't use suspend, the reason
is that it doesn't work reliably, quite often due to driver issues.
   
   I would believe it if I knew people using suspend/resume on the other OS.
   But that's not the case either. Also, it happens that with today's RAM
   sizes, suspend-to-disk then resume can be several times slower than a
   clean fresh boot. When you have 1 GB to write at 20 MB/s, it takes 50
   seconds to shut down, and as much to restart. Compare this to 5-10
   seconds for a shutdown and 30-50 seconds for a cold boot, and it might
   give you another clue why there are people not interested in such a
   feature.
  
  I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv
  card that Linux doesn't support well yet), and I know other Suspend2
  users doing the same. It's made easier by the fact that Suspend2 lets
  you reboot instead of powering down.
  
  As to comparing the speed with the time to boot, your estimates are way
  out. Both will of course vary with the harddrive and cpu speeds and
  compression qualities of the image, but with Suspend2, I'm seeing speeds
  more in the range of 40-100MB/s, and even had a report of 160MB/s a
  couple of days ago. The rule of thumb I use is:
  
  Run hdparm -t (or equiv) on the drive you'll be using:
  
  [EMAIL PROTECTED]:~$ sudo hdparm -t /dev/hda
  
  /dev/hda:
   Timing buffered disk reads:  120 MB in  3.02 seconds =  39.70 MB/sec
  
  Then calculate RAM_IN_MB / 2 / HDPARM_RESULT = seconds to read/write
  image.
  
  In my case: 1024 / 2 / 39.7 = approx 13 seconds. The / 2 is because with
  LZF compression, you normally get about 50% compression.
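
The rule of thumb quoted above is easy to turn into a helper. This is
only a sketch: the function name and the compression_ratio parameter
are made up here, and it assumes the whole of RAM ends up in the image,
so it gives an upper bound rather than a measurement.

```c
/*
 * Rule of thumb from the message above:
 *   seconds = RAM in MB * compression ratio / disk throughput in MB/s
 * compression_ratio is the fraction of the data left after compression
 * (0.5 for the roughly 50% quoted for LZF, 1.0 for no compression).
 */
double image_io_seconds(double ram_mb, double compression_ratio,
                        double disk_mb_per_s)
{
    if (disk_mb_per_s <= 0.0)
        return -1.0;    /* caller passed a meaningless throughput */
    return ram_mb * compression_ratio / disk_mb_per_s;
}
```

With the figures from the thread, image_io_seconds(1024, 0.5, 39.7)
comes to about 12.9 seconds, while 1 GB written uncompressed at 20 MB/s,
image_io_seconds(1024, 1.0, 20), comes to about 51 seconds.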
  
  I think the main reason some people aren't interested in suspend to disk
  is because of myths (if you'll excuse the term) like the one you've put
  above. Of course the values you give were more accurate for swsusp and
  uswsusp until recently, but Suspend2 has had async I/O and compression
  for years, so all I can really do is encourage you to try again.
 
 Well, I agree that you give some good arguments here.
 
  Of course there's another factor you're not taking into account: With
  suspending to disk, you don't have to close and reopen documents or shut
  down and restart applications. The time to do that should be factored
  into the non-suspend-to-disk time to compare apples with apples.
 
 Hmm sorry, but we don't have the same usages of notebooks. For no reason
 would I keep documents open, for two reasons :
 
   - when I shutdown my notebook, it is to move from one customer to
 home/company/another customer. There's no related work anyway, the
 network will have changed and I'll have to switch nearly all of my
 apps anyway. So using suspend just to save one reboot is not worth
 it (for me) IMHO.

The network configuration utilities can help there. In addition,
Suspend2 preserves the commandline you used to boot with
(/sys/power/suspend2/resume_commandline), so you can use a combination
of slightly varying grub entries (I have one for not starting ath0 and
one for starting it) and scripts to do different things in different
environments. The resume_commandline is writable, so can be cleared
after usage if there were anything sensitive there.

   - I would certainly not keep open documents that are on crypted FS
 while I travel. Otherwise, it would be a total waste of time to
 enter my passphrase everytime I need to access them ! Some might
 argue that it would save me a lot of time, providing me with the
 ability to type my passphrase only once a month, but that's not

Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 02:57 +0400, Manu Abraham wrote:
 On 2/12/07, Nigel Cunningham [EMAIL PROTECTED] wrote:
 
  Neither am I. I'm just asking that new drivers have power management as
  standard.

 What if the hardware doesn't support power management ?

You would still want to do the cleanup and configuration that you'd do
for module load/unload.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 00:16 +0100, Rafael J. Wysocki wrote:
 On Monday, 12 February 2007 00:10, Nigel Cunningham wrote:
  Hi.
  
  On Sun, 2007-02-11 at 21:02 +0000, Alan wrote:
 If the device requires that, implement .suspend and .resume or at
 least define .suspend that will always return -ENOSYS (then people will
 know they have to unload the driver before the suspend).  Similarly,
 if you aren't sure whether or not the device requires .suspend and
 .resume, define .suspend that will always return -ENOSYS.

Sounds ok to me. Where should this text go?
Documentation/SubmittingDrivers ?
   
   And testing/submitting drivers, perhaps with additional text in that to
   make it clear we want suspend/resume support or good excuses
   
   Please verify your driver correctly handles suspend and resume. If it
   does not your patch submission is likely to be suspended and only resume
   when the driver correctly handles this feature
  
  Maybe make it explicit that testing should be done for both suspend to
  ram and to disk, and with the following usage scenarios as applicable?
  
  - built in;
  - modular, loaded while suspending but not loaded prior to resume from
  disk;
  - modular, loaded while suspending and loaded prior to resume from disk;
 
 I think we should state the general rule in Documentation/SubmittingDrivers
 and give more details in Documentation/power/devices.txt

Sounds good.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 00:21 +0100, Pavel Machek wrote:
 Hi!
 
 define .suspend that will always return -ENOSYS (then people will
 know they have to unload the driver before the suspend).  Similarly,
 if you aren't sure whether or not the device requires .suspend and
 .resume, define .suspend that will always return -ENOSYS.

Sounds ok to me. Where should this text go?
Documentation/SubmittingDrivers ?
   
   And testing/submitting drivers, perhaps with additional text in that to
   make it clear we want suspend/resume support or good excuses
   
   Please verify your driver correctly handles suspend and resume. If it
   does not your patch submission is likely to be suspended and only resume
   when the driver correctly handles this feature
  
  Maybe make it explicit that testing should be done for both suspend to
  ram and to disk, and with the following usage scenarios as applicable?
 
 Well, for many people s2ram does not work even today... so requiring
 them to test it is slightly draconian.
 
  - built in;
  - modular, loaded while suspending but not loaded prior to resume from
  disk;
 
 These two should be equivalent.

No. The differences are:

Built in: The initcalls will have run, but the driver may or may not
actually have been used, depending on whether it's used before we start
the resume. It should probably be tested with both having been used and
not having been used.

Modular, loaded prior to suspending but not prior to resuming: At resume
time, will still be in whatever config the BIOS puts it in. No Linux
driver code will have touched it.

Modular, loaded prior to suspending and resuming: Should be equivalent
to built in (module initcalls will have run), but may vary if there's
some difference in code/timing between being a module and built in.
(This shouldn't happen, but that's the point of testing).
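
Put as a test matrix, the scenarios above might be enumerated like this.
All the names are made up for illustration, and the sketch assumes that
"loaded or not prior to resume" only matters for suspend to disk, so
for suspend to RAM the two modular cases collapse into one. It does not
model the used/unused distinction for the built-in case.

```c
#include <stdbool.h>
#include <stddef.h>

enum sleep_kind { SLEEP_RAM, SLEEP_DISK };

enum load_mode {
    BUILT_IN,
    MODULE_NOT_LOADED_BEFORE_RESUME,
    MODULE_LOADED_BEFORE_RESUME,
};

struct pm_test_case {
    enum sleep_kind sleep;
    enum load_mode load;
};

/*
 * Assumption of this sketch: only suspend to disk boots a fresh kernel
 * before the image is restored, so only there can a module be absent
 * at resume time.
 */
static bool case_applies(enum sleep_kind sleep, enum load_mode load)
{
    return !(sleep == SLEEP_RAM && load == MODULE_NOT_LOADED_BEFORE_RESUME);
}

/* Fill `out` (capacity `cap`) with the applicable cases; return how many. */
size_t build_pm_test_matrix(struct pm_test_case *out, size_t cap)
{
    size_t n = 0;

    for (int s = SLEEP_RAM; s <= SLEEP_DISK; s++)
        for (int l = BUILT_IN; l <= MODULE_LOADED_BEFORE_RESUME; l++)
            if (case_applies(s, l) && n < cap)
                out[n++] = (struct pm_test_case){ s, l };
    return n;
}
```

Under those assumptions that is five combinations per driver: two for
suspend to RAM and three for suspend to disk.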

  - modular, loaded while suspending and loaded prior to resume from disk;
 
 Ok.. but I'm not sure how many people will actually test it _that_
 thoroughly. "Try to test it" is good enough for a first version. When
 suspend is in better shape, we can ask for more.

I'd prefer to ask for what should be done from the start. Will we expect
people to go back and retest if we change the rules, or do we prefer
them to complain "You didn't adequately point out the testing I needed
to do, and I got all these complaints from my users!"?

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 03:25 +0400, Manu Abraham wrote:
 On 2/12/07, Nigel Cunningham [EMAIL PROTECTED] wrote:
  Hi.
 
  On Mon, 2007-02-12 at 02:57 +0400, Manu Abraham wrote:
   On 2/12/07, Nigel Cunningham [EMAIL PROTECTED] wrote:
  
Neither am I. I'm just asking that new drivers have power management as
standard.
 
   What if the hardware doesn't support power management ?
 
  You would still want to do the cleanup and configuration that you'd do
  for module load/unload.
 
 By adding dummy functions, wouldn't that just look awkward ?

If all you need to do is say 'I don't need to do anything' and we have a
shared function that does that, all we're talking about doing is adding
to your struct pci_driver (or whatever)

.resume = generic_empty_resume;

To me at least, that doesn't look awkward, and says cleanly and clearly
that you've checked things over and decided you know what's required.
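
Spelled out a little more, as a user-space sketch rather than kernel
code. Only the name generic_empty_resume comes from the line above;
struct toy_driver_ops, generic_empty_suspend and pm_declared are
hypothetical stand-ins, and no such shared helpers are claimed to exist
in the tree today.

```c
#include <stddef.h>

struct toy_device;  /* opaque in this sketch */

/* Stand-in for a driver ops table such as struct pci_driver. */
struct toy_driver_ops {
    const char *name;
    int (*suspend)(struct toy_device *dev);
    int (*resume)(struct toy_device *dev);
};

/* Shared no-op callbacks: "I checked, and there is nothing to do." */
int generic_empty_suspend(struct toy_device *dev) { (void)dev; return 0; }
int generic_empty_resume(struct toy_device *dev)  { (void)dev; return 0; }

/* A driver whose hardware keeps no state that needs saving or restoring. */
const struct toy_driver_ops stateless_driver = {
    .name    = "stateless",
    .suspend = generic_empty_suspend,
    .resume  = generic_empty_resume,
};

/* A reviewer (or the core) can tell a deliberate no-op from an omission. */
int pm_declared(const struct toy_driver_ops *ops)
{
    return ops->suspend != NULL && ops->resume != NULL;
}
```

A driver that leaves both pointers NULL fails pm_declared(), whereas
one that points at the shared no-ops passes, even though both end up
doing nothing at suspend time.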

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 00:29 +0100, Rafael J. Wysocki wrote:
  On Sun, 2007-02-11 at 01:44 +0100, Rafael J. Wysocki wrote:
Well, it's probably more acceptable than silently doing nothing and the
device failing or locking up the machine on resume, but I couldn't agree
more that it's not what we want to be encouraging. Perfect may be the
enemy of the good, but "works except no power management" is hardly what
I would call good these days, more like pretty sloppy.
   
   I think there are situations in which it can be justified, like:
   - The driver is not entirely finished, but we want to merge it early,
   because of many potential users,
   - The driver has only a few users who aren't interested in the
   suspend/resume functionality,
  
  How do you determine that? How many users have to want suspend/resume
  functionality before you say "Ok. It has to be done now"?
 
 That depends on what the driver author tells us.  If he says there's only one
 such device in the world and it needs a Linux driver, but the system in
 question will never be suspended, that will be fine, I think.  There are such
 cases already and I see no reason why there won't be any more in the future.
 
   - The device is undocumented and we don't know how to make it handle the
   suspend/resume (we may learn that in the future or not).
  
  If we know how to initialise/cleanup, we know a good portion of what is
  needed for suspend/resume. Sure, for some video chipsets, you need more
  (you need to know how to reprogram the whole thing after S3), but
  they're the exception. Yes, there are other cases. But on the whole,
  we're not talking about esoteric knowledge.
 
 No, in general this is not _that_ simple.  Please browse the archives of
 bcm43xx-dev, for example.

Yeah. The problems of not having documentation + having to reassociate
and so on.

 While I agree that the support for suspend and resume _is_ generally
 important, I also admit that there are situations in which it doesn't
 matter and there are many people who won't care a whit for it.

Ok, but that's the exception, right? Not the rule? So in those cases, an
exception is made.

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 00:38 +0100, Willy Tarreau wrote:
 On Mon, Feb 12, 2007 at 10:18:42AM +1100, Nigel Cunningham wrote:
 [snip]
   Hmm sorry, but we don't have the same usages of notebooks. For no reason
   would I keep documents open, for two reasons :
   
 - when I shutdown my notebook, it is to move from one customer to
   home/company/another customer. There's no related work anyway, the
   network will have changed and I'll have to switch nearly all of my
   apps anyway. So using suspend just to save one reboot is not worth
   it (for me) IMHO.
  
  The network configuration utilities can help there. In addition,
  Suspend2 preserves the commandline you used to boot with
  (/sys/power/suspend2/resume_commandline), so you can use a combination
  of slightly varying grub entries (I have one for not starting ath0 and
  one for starting it) and scripts to do different things in different
  environments. The resume_commandline is writable, so can be cleared
  after usage if there were anything sensitive there.
 
 OK, I see there are features to make life easier when I decide to use
 suspend. But it looks like that using suspend is the goal and dealing
 with the constraints is a lot of work and I'm still far from being
 convinced that it would provide me advantage.

Ok. I don't feel like I have to convince everyone :)

 - I would certainly not keep open documents that are on crypted FS
   while I travel. Otherwise, it would be a total waste of time to
   enter my passphrase everytime I need to access them ! Some might
   argue that it would save me a lot of time, providing me with the
   ability to type my passphrase only once a month, but that's not
   what I'm looking for :-)
  
  People are using Suspend2 with encryption today (I'm not sure about
  uswsusp). Some of them have set things up so you need to use a
  passphrase or usb key to resume, and the image itself is of course
  encrypted too.
 
 Unless I'm mistaken, I have to type the passphrase twice then :
   - once at suspend
   - once at resume
 
 which is once more per boot than what I'm doing on loop-aes.

I'm not sure. I don't use encryption myself, so I don't understand all
the fine details. I just know that there are people out there using
encryption, loop-aes, dmsetup and all that sort of stuff. I don't have
to worry about it because they use an initrd/ramfs to do whatever they
need to do to provide access to the device on which the image is found,
then

echo /dev/whatever_funny_device > /sys/power/suspend2/resume2
echo > /sys/power/suspend2/do_resume

  You could also close the document and not the app. Or both and just get
  the benefit of having the app in page cache post-resume.
 
 I'm not much convinced by the advantage of reading 500 MB on disk to have
 emacs in hot cache :-)

Neither am I! Presumably you'd have a lot more than emacs in there
though :) You could always switch to vim! (*ducks*)

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 00:41 +0100, Rafael J. Wysocki wrote:
  I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv
  card that Linux doesn't support well yet), and I know other Suspend2
  users doing the same. It's made easier by the fact that Suspend2 lets
  you reboot instead of powering down.
 
 Well, I don't know why you're saying it's a special capability of suspend2.
 Even the old swsusp has been able to do this since I can remember. ;-)

It does?! I just did cat /sys/power/disk and it only says platform. How
do you make swsusp reboot instead of powering down?

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 00:50 +0100, Rafael J. Wysocki wrote:
 On Monday, 12 February 2007 00:47, Nigel Cunningham wrote:
  Hi.
  
  On Mon, 2007-02-12 at 00:41 +0100, Rafael J. Wysocki wrote:
I'm using M$ hibernation and Suspend2 to dual boot on our desktop (dtv
card that Linux doesn't support well yet), and I know other Suspend2
users doing the same. It's made easier by the fact that Suspend2 lets
you reboot instead of powering down.
   
   Well, I don't know why you're saying it's a special capability of 
   suspend2.
   Even the old swsusp has been able to do this since I can remember. ;-)
  
  It does?! I just did cat /sys/power/disk and it only says platform. How
  do you make swsusp reboot instead of powering down?
 
 echo reboot > /sys/power/disk && echo disk > /sys/power/state

Ah. Perhaps you should make it show reboot when you cat it?

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 01:09 +0100, Rafael J. Wysocki wrote:
 On Monday, 12 February 2007 00:55, Nigel Cunningham wrote:
  Hi.
  
  On Mon, 2007-02-12 at 00:50 +0100, Rafael J. Wysocki wrote:
   On Monday, 12 February 2007 00:47, Nigel Cunningham wrote:
Hi.

On Mon, 2007-02-12 at 00:41 +0100, Rafael J. Wysocki wrote:
  I'm using M$ hibernation and Suspend2 to dual boot on our desktop 
  (dtv
  card that Linux doesn't support well yet), and I know other Suspend2
  users doing the same. It's made easier by the fact that Suspend2 
  lets
  you reboot instead of powering down.
 
 Well, I don't know why you're saying it's a special capability of 
 suspend2.
 Even the old swsusp has been able to do this since I can remember. 
 ;-)

It does?! I just did cat /sys/power/disk and it only says platform. How
do you make swsusp reboot instead of powering down?
   
   echo reboot > /sys/power/disk && echo disk > /sys/power/state
  
  Ah. Perhaps you should make it show reboot when you cat it?
 
 albercik:~ # echo reboot > /sys/power/disk
 albercik:~ # cat /sys/power/disk
 reboot
 
 It shows the current value, and platform happens to be the default now.

Oh, so the problem is that it shows the current value, not the
possibilities. I wrongly assumed it would work like /sys/power/state.
That explains it :)

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-11 Thread Nigel Cunningham
Howdy!

On Mon, 2007-02-12 at 01:10 +0100, Tilman Schmidt wrote:
 Hi,
 
 On 11.02.2007 at 23:37, Nigel Cunningham wrote:
  On Sun, 2007-02-11 at 00:45 +0100, Tilman Schmidt wrote:
  On 10.02.2007 at 23:37, Nigel Cunningham wrote:
  If your device requires power management, and you know it requires power
  management, why not just implement power management? [...]
  Like it or not, power management is far from trivial, and people
  writing device drivers have limited resources. [...]
  It's not that complex. All we're really talking about is a bit of extra
  code to cleanup and configure hardware state; things that the driver
  author already knows how to do. S3 might require a bit more
  initialisation if firmware needs to be reloaded or more extensive
  configuration needs to be done, but if there's firmware to be loaded,
  there is a reasonably good probability that we loaded it from Linux to
  start with anyway.
 
 You are assuming a perfect world where driver authors have complete
 knowledge of their devices. In reality, many drivers (including
 those I have the mixed pleasure of maintaining) are based at least
 in part on reverse engineering, and managing power states may well
 fall into the domain of things not yet sufficiently reverse
 engineered.

Nope. I'm assuming that the driver author knows what needs to be done to
get the driver out of whatever state the BIOS puts it in to start with,
and into an operational state, and that they therefore also know what
needs to be done to take it out of the operational state again. I'm
admitting that there's also another state - the post suspend-to-ram
driver state - that they may not know how to deal with. But for
suspend-to-disk, if you know how to get the driver to work in the first
place, you know enough to stop it working (.suspend) and start it up
again (.resume) for the hibernate case at least.

I'm not assuming that you know enough to be able to put the driver into
a low state and get it out again. This is definitely preferable, and at
least possibly essential for suspend to ram, but for some unknown reason
I'm quite hibernation focused, and for that, just the above is
sufficient.
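
A minimal userspace sketch of the argument above, assuming a toy device whose
hibernate-only suspend and resume simply reuse the teardown and bring-up paths
the driver already needs. Every toy_* name here is hypothetical and stands in
for real driver code; it is an illustration, not a kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; stands in for real hardware registers. */
struct toy_dev {
	bool configured;   /* hardware state set up and usable */
	bool quiesced;     /* no activity in flight */
	int  init_count;   /* how many times the hardware was initialised */
};

/* Bring the hardware from its power-on state to an operational state. */
static int toy_hw_init(struct toy_dev *d)
{
	assert(d != NULL);
	d->configured = true;
	d->quiesced = false;
	d->init_count++;
	return 0;
}

/* Stop activity and drop hardware state, as module unload would. */
static void toy_hw_shutdown(struct toy_dev *d)
{
	assert(d != NULL);
	d->quiesced = true;
	d->configured = false;
}

/* Hibernate-only suspend: the same teardown used at unload time. */
static int toy_suspend(struct toy_dev *d)
{
	toy_hw_shutdown(d);
	return 0;
}

/* Hibernate-only resume: the same bring-up used at load time. */
static int toy_resume(struct toy_dev *d)
{
	return toy_hw_init(d);
}
```

Nothing here puts the device into a low-power state, which is the extra
knowledge suspend to RAM would probably need.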

  Also, in your argument you neglected a few cases:
  - What if my device does not require power management?
  
  Then you use a generic routine that does nothing but return success
  (potentially shared with other drivers that are in the same situation).
 
 But if I just write an empty routine like that I open myself up to
 criticism along the lines of writing dummy routines just in order
 to shut up kernel warnings. BTDT.

Well, it might not be completely empty. I think someone already pointed
out that there's a minimal workset for the pci bus that pci drivers
would want to invoke. But we wouldn't (rightly) accuse you of such
things if we decided that the policy was "Every driver ought to have a
resume routine, even if it's just a minimal I-just-work routine".

  - What if I don't know whether my device requires power management?
  
  The questions are straight forward: Is there hardware state that needs
  to be configured if you've just booted the computer and nothing else has
  touched it? If so, that needs to be done in a resume method. Do you need
  to clean up state prior to doing the things in the resume method, or
  otherwise do things to quiesce the driver? If so, they will need to be
  done in the suspend method. The result will be roughly similar to what
  you do for module load/unload, except maybe less complete in some cases.
 
 I don't doubt your basic assessment. However it doesn't translate that
 easily into a real implementation. In my case, I maintain a USB driver,
 so I have to deal with USB specifics of suspend/resume which happen not
 to be that well documented. My driver provides an isdn4linux device but
 isdn4linux knows nothing about suspend/resume so I am on my own on how
 to reconcile the two. The device itself, though in turn far from trivial,
 is actually the least of my worries.

Mmm, so that's a case where we need to prod those who write
documentation and bus support first. You're probably closer! :)

Regards,

Nigel



Re: SATA-performance: Linux vs. FreeBSD

2007-02-12 Thread Nigel Cunningham
Hi Alan et al.

On Mon, 2007-02-12 at 19:08 +, Alan wrote:
 I'm not sure you'll get 50MB/sec sustained to work although you might
 with a good current drive used for nothing else, a linear stream of data
 (no seeking and file system overhead), and a non PCI controller (PCI
 Express, host chipset bus etc). 

That's Suspend2's usage pattern when given a whole partition, so I can
state without reservation you can get maximum throughput under those
circumstances, even with a PCI controller. Swsusp should do about the
same too.

Nigel



Re: NAK new drivers without proper power management?

2007-02-12 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 16:57 +0100, Geert Uytterhoeven wrote:
 On Mon, 12 Feb 2007, Pavel Machek wrote:
   Can't the upper layer just assume -ENOSYS if .resume/.suspend is NULL?
   It's nicer if you don't have to implement dummy functions at all.
  
  Unfortunately, drivers currently assume NULL == nothing is needed,
  so we'd have to do a big search & replace... 
 
 Which means you also cannot easily keep track of which driver supports
 suspend/resume and which doesn't, as there will always be drivers where a
 missing suspend/resume function is correct.
 
 Wouldn't it be more sensible to have
 
 .suspend = suspend_nothing_to_do
 
 instead, and reserve NULL for `not yet implemented'?

Agreed.
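
A minimal sketch of that convention in userspace C, assuming a simplified
driver structure. The suspend_nothing_to_do name comes from the suggestion
above; struct toy_driver and the toy_* helpers are hypothetical, and the
-ENOSYS return for a NULL hook is the behaviour proposed in this thread, not
what drivers currently assume.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_driver {
	const char *name;
	int (*suspend)(void *dev);   /* NULL means "not yet implemented" */
};

/* Shared helper for drivers that were checked and need no work. */
static int suspend_nothing_to_do(void *dev)
{
	(void)dev;
	return 0;
}

/* Core-side call: refuse to suspend through an unimplemented hook. */
static int toy_call_suspend(const struct toy_driver *drv, void *dev)
{
	assert(drv != NULL);
	if (drv->suspend == NULL)
		return -ENOSYS;
	return drv->suspend(dev);
}

/* Audit helper: has this driver declared suspend support at all? */
static int toy_supports_suspend(const struct toy_driver *drv)
{
	assert(drv != NULL);
	return drv->suspend != NULL;
}
```

With this split, a scan for NULL hooks would list exactly the drivers nobody
has looked at yet.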

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-12 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 22:01 +0100, Rafael J. Wysocki wrote:
 On Monday, 12 February 2007 21:58, Pavel Machek wrote:
  Hi!
  
 If all you need to do is say 'I don't need to do anything' and we 
 have a
 shared function that does that, all we're talking about doing is 
 adding
 to your struct pci_device (or whatever)
 
 .resume = generic_empty_resume;
 
 To me at least, that doesn't look awkward, and says cleanly and 
 clearly
 that you've checked things over and decided you know what's required.

Actually, I'd like it to be

.resume = generic_empty_resume; /* Explain, why your driver needs no
   resume */
   
   Okay, but we can't define an empty .resume(), because, for example, the 
   PCI's
   generic suspend/resume won't be called.
  
  PCI drivers should just do .resume = pci_generic_resume, explicitly.
 
 Well, I generally agree, but I think the idea with the pm_safe flag has some
 advantages.  For example, the drivers that do define .suspend() and .resume()
 which don't work correctly could be flagged as not pm_safe until the problems
 are fixed.

Oooh. Now I like that idea. Are you thinking of a document in
Documentation/power that describes why pm_safe is off, or comments in
the code itself?

Regards,

Nigel



Re: NAK new drivers without proper power management?

2007-02-12 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 06:19 +0100, Willy Tarreau wrote:
 One less myth, as Nigel would call it ;-)

You know me too well! :




Re: NAK new drivers without proper power management?

2007-02-12 Thread Nigel Cunningham
Hi.

On Mon, 2007-02-12 at 21:06 +0100, Rafael J. Wysocki wrote:
 On Monday, 12 February 2007 05:08, Nigel Cunningham wrote:
  Nope. I'm assuming that the driver author knows what needs to be done to
  get the driver out of whatever state the BIOS puts it in to start with,
  and into an operational state, and that they therefore also know what
  needs to be done to take it out of the operational state again. I'm
  admitting that there's also another state - the post suspend-to-ram
  driver state - that they may not know how to deal with. But for
  suspend-to-disk, if you know how to get the driver to work in the first
  place, you know enough to stop it working (.suspend) and start it up
  again (.resume) for the hibernate case at least.
 
 We're talking about _both_ the STR and STD.  The drivers that have problems
 with the STR cannot be regarded as suspend/resume-safe IMO.

Yeah, I'm not disagreeing at all. I'm just admitting my bias toward the
bit I concentrate on more.

[...]

  Mmm, so that's a case where we need to prod those who write
  documentation and bus support first. You're probably closer! :)
 
 Actually, the lack of documentation is a major problem that we all should
 try to fix in the first place.  Unfortunately the code has been recently
 changing quite often, so that's difficult.

Yeah.

Nigel



Re: [RFC][PATCH] PM: Document requirements for basic PM support in drivers

2007-02-12 Thread Nigel Cunningham
Hi.

On Tue, 2007-02-13 at 00:23 +0100, Rafael J. Wysocki wrote:
 Hi,
 
 Here's my attempt to document the requirements with respect to the basic PM
 support in drivers and the testing of that.  Comments welcome.
 
 Greetings,
 Rafael
 
 ---
  Documentation/SubmittingDrivers |   10 ++
  Documentation/power/drivers-testing.txt |  119 
 
  2 files changed, 129 insertions(+)
 
 Index: linux-2.6.20-git4/Documentation/SubmittingDrivers
 ===
 --- linux-2.6.20-git4.orig/Documentation/SubmittingDrivers
 +++ linux-2.6.20-git4/Documentation/SubmittingDrivers
 @@ -87,6 +87,16 @@ Clarity:   It helps if anyone can see how 
   driver that intentionally obfuscates how the hardware works
   it will go in the bitbucket.
  
 +PM support:  Since Linux is used on many portable and desktop systems, your
 + driver is likely to be used on such a system and therefore it
 + should support basic power management by implementing, if
 + necessary, the .suspend and .resume methods used during the
 + system-wide suspend and resume transitions.  You should verify
 + that your driver correctly handles the suspend and resume, but
 + if you are unable to ensure that, please at least define the
 + .suspend method returning the -ENOSYS (Function not
 + implemented) error.
 +
  Control: In general if there is active maintainance of a driver by
   the author then patches will be redirected to them unless
   they are totally obvious and without need of checking.
 Index: linux-2.6.20-git4/Documentation/power/drivers-testing.txt
 ===
 --- /dev/null
 +++ linux-2.6.20-git4/Documentation/power/drivers-testing.txt
 @@ -0,0 +1,119 @@
 +Testing suspend and resume support in drivers
 + (C) 2007 Rafael J. Wysocki [EMAIL PROTECTED]
 +
 +Unfortunately, to effectively test the support for the system-wide suspend and
 +resume transitions in a driver, it is necessary to suspend and resume a fully
 +functional system with this driver loaded.  Moreover, that should be done many
 +times, preferably many times in a row, and separately for the suspend to disk
 +(STD) and the suspend to RAM (STR) transitions, because each of these cases
 +involves different ordering of operations and different interactions with the
 +machine's BIOS.
 +
 +Of course, for this purpose the test system has to be known to suspend and
 +resume without the driver being tested.  Thus, if possible, you should first
 +resolve all suspend/resume-related problems in the test system before you start
 +testing the new driver.
 +
 +I. Preparing the test system
 +
 +1. To verify that the STD works, you can try to suspend in the reboot mode:
 +
 +# echo reboot > /sys/power/disk
 +# echo disk > /sys/power/state
 +
 +and the system should suspend, reboot, resume and get back to the command prompt
 +where you have started the transition.  If that happens, the STD is most likely
 +to work correctly, but you can repeat the test a couple of times in a row for
 +confidence.  You should also test the platform and shutdown modes of

I would say "you need to repeat the test at least a couple of times...",
perhaps adding something along the lines of "This is necessary because
some problems only show up on a second attempt at suspending and
resuming a driver. You can think of it as the driver coming back 'dazed
and confused' after the first cycle, and only being properly killed by
the second attempt."

 +suspend:
 +
 +# echo platform > /sys/power/disk
 +# echo disk > /sys/power/state
 +
 +or
 +
 +# echo shutdown > /sys/power/disk
 +# echo disk > /sys/power/state
 +
 +in which cases you will have to press the power button to make the system
 +resume.  If that works, you are ready to test the STD with the new driver
 +loaded.  Otherwise, you have to identify what is wrong.
 +
 +a) To verify if there are any drivers that cause problems you can run the STD
 +in the test mode:
 +
 +# echo test > /sys/power/disk
 +# echo disk > /sys/power/state
 +
 +in which case the system should freeze tasks, suspend devices, disable nonboot
 +CPUs (if any), wait for 5 seconds, enable nonboot CPUs, resume devices, thaw
 +tasks and return to your command prompt.  If that fails, most likely there is
 +a driver that fails to either suspend or resume (in the latter case the system
 +may hang or be unstable after the test, so please take that into consideration).
 +To find this driver, you can carry out a binary search according to the rules:
 +- if the test fails, unload a half of the drivers currently loaded and repeat
 +(that would probably involve rebooting the system, so always note what drivers
 +have been loaded before the test),
 +- if the test succeeds, load a half of the drivers you have unloaded most
 

Re: [linux-pm] 2.6.21-rc4-mm1: freezing of processes broken

2007-03-20 Thread Nigel Cunningham
Hi.

On Tue, 2007-03-20 at 19:23 -0600, Eric W. Biederman wrote:
 Rafael J. Wysocki [EMAIL PROTECTED] writes:
 
  On Tuesday, 20 March 2007 22:06, Rafael J. Wysocki wrote:
  On Tuesday, 20 March 2007 21:58, Jiri Slaby wrote:
   Rafael J. Wysocki wrote:
Actually, the problem is 100% reproducible on my system too and I doubt
  it's
caused by the recent freezer patches.
   
   I don't know what exactly do you mean by recent, but 2.6.21-rc3-mm2 works
   for me.
  
  Thanks for the confirmation.
  
  The patches I was talking about had already been in 2.6.21-rc3-mm2, so the
  reason of this failure must be different.
 
  Bisection shows that the freezing of processes has been broken by one of the
  patches:
 
  remove-the-likelypid-check-in-copy_process.patch
 
 Grr.  Oleg's review of remove-the-likelypid-check-in-copy-process
 showed it to be questionable (and it was just an optimization)
 so we can get rid of that one easily. 
 
 Although all it did that was really questionable was add
 the idle process to the global process list and bump a process
 count when we forked the idle process.  Not dramatically dangerous
 things.
 
  use-task_pgrp-task_session-in-copy_process.patch
 
 As I recall that patch was pretty trivial, and shouldn't have
 anything to do with the freezer.   The process freezer doesn't care
 about pids does it?

Could the freezer code be trying to freeze the idle thread as a result?

Regards,

Nigel



Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread Nigel Cunningham
Hi.

On Wed, 2007-03-21 at 18:40 +0200, Maxim Levitsky wrote:
 Hi,
 
 Starting with 2.6.21-rc1, suspend to RAM and disk doesn't work anymore on my
 system.
 
 I did a git-bisect and found that these commits break it:
 
 e3c7db621bed4afb8e231cb005057f2feb5db557 - [PATCH] [PATCH] PM: Change code ordering in main.c
 ed746e3b18f4df18afa3763155972c5835f284c5 - [PATCH] [PATCH] swsusp: Change code ordering in disk.c
 259130526c267550bc365d3015917d90667732f1 - [PATCH] [PATCH] swsusp: Change code ordering in user.c
 
 I already reported this, but now I know the reason why suspend breaks.
 
 The problem is that both cpu_up and cpu_down were allowed to sleep until now,
 and that worked because those functions could be called only in process
 context (the one that writes to /sys/devices/system/cpu/cpu*/online) or from
 the idle thread (the one that does smp_init()).
 
 But now they are called _after_ all tasks were suspended, so if cpu_down
 tries, for example, to take a lock that is held by a different process, it
 can't, since that process is frozen and can't release the lock.
 
 I tested this and all results are positive:
 
 I disabled the 2nd cpu by hand, and then suspend to RAM was successful.
 Suspend to disk went correctly, but it hung on resume, and I know why.
 It hung in the old kernel trying to disable the 2nd cpu that was enabled by it.
 
 Using kdb, I was able to confirm that this is true, because it was still
 possible to enter kdb and see that the idle thread (swapper) was active, and
 uswsusp was waiting on a mutex inside workqueue_cpu_callback.
 
 The solution for this problem seems to be either a complete audit of the code
 that uses register_cpu_notifier, to ensure that it doesn't sleep.
 The documentation should also be changed to note this.
 
 Or, it is also possible to revert this change.

Do you know exactly which mutex was being waited on and where it was
taken? If you can say that, it would be much more helpful.

Regards,

Nigel



Re: [BUG] Code reordering in swsusp breaks suspend on SMP systems

2007-03-21 Thread Nigel Cunningham
Hi.

On Wed, 2007-03-21 at 22:38 +0100, Rafael J. Wysocki wrote:
  Do you know exactly which mutex was being waited on and where it was
  taken? If you can say that, it would be much more helpful.

Yeah, me too, but assuming too much sometimes bites me :)

 I think this is the XFS problem with freezable workqueues.
 
 Maxim, please try to apply the appended patch and see if it helps.

Thanks for your subsequent messages, Maxim. Could you confirm for us
that the patch Rafael attached fixes it?

Regards,

Nigel

 ---
 Since freezable workqueues are broken in 2.6.21-rc
 (cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
 http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
 it's better to remove them altogether for 2.6.21 and change the only user of
 them (XFS) accordingly.
 
 ---
  fs/xfs/linux-2.6/xfs_buf.c |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 Index: linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
 ===
 --- linux-2.6.21-rc4.orig/fs/xfs/linux-2.6/xfs_buf.c
 +++ linux-2.6.21-rc4/fs/xfs/linux-2.6/xfs_buf.c
 @@ -1829,11 +1829,11 @@ xfs_buf_init(void)
   if (!xfs_buf_zone)
   goto out_free_trace_buf;
  
 - xfslogd_workqueue = create_freezeable_workqueue("xfslogd");
 + xfslogd_workqueue = create_workqueue("xfslogd");
   if (!xfslogd_workqueue)
   goto out_free_buf_zone;
  
 - xfsdatad_workqueue = create_freezeable_workqueue("xfsdatad");
 + xfsdatad_workqueue = create_workqueue("xfsdatad");
   if (!xfsdatad_workqueue)
   goto out_destroy_xfslogd_workqueue;
  



Re: [PATCH 1/8] Enhance process freezer interface for usage beyond software suspend

2007-04-06 Thread Nigel Cunningham
Hi.

On Fri, 2007-04-06 at 16:34 +0200, Rafael J. Wysocki wrote:
 On Monday, 2 April 2007 22:51, Pavel Machek wrote:
  Hi!
  
 +/* Per process freezer specific flags */
 +#define PF_FE_SUSPEND 0x8000  /* This thread should not be frozen
 +  * for suspend
 +  */
 +
 +#define PF_FE_KPROBES 0x0010  /* This thread should not be frozen
 +  * for Kprobes
 +  */

Just put the comment before the define for long comments?
   
   Agreed.
  
  (Actually it would be nice to say
  
  /* This thread should not be frozen for suspend, because it is needed
     for getting image saved to disk */
  
 -#ifdef CONFIG_PM
 +#if defined(CONFIG_PM) || defined(CONFIG_HOTPLUG_CPU) || \
 + defined(CONFIG_KPROBES)

Should we create CONFIG_FREEZER?
   
   Why do you think so?  I think the freezer should be compiled automatically
   if any of the above is set, which is what this directive really means.
  
  Kconfig can do that. (select statement). If we have one such ifdef,
  it is okay, but if it would be more of them.
  
Hmmm, I do not really like softlockup watchdog running during suspend.
Can we make this freezeable and make watchdog shut itself off while
suspending?
   
   Generally, I agree, but this patch only replaces the existing instances
   of PF_NOFREEZE with the new mechanism.  The changes you're talking about
   require a separate patch series (or at least one separate patch), I 
   think, and
   they need not be so simple to make.
  
  Agreed about separate patch series.
  
 - current->flags |= PF_NOFREEZE;
 + freezer_exempt(FE_ALL);
   pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
   if (pid > 0) {
   while (pid != sys_wait4(-1, NULL, 0, NULL))

Does this mean we have userland /linuxrc running with PF_NOFREEZE?
That would be very bad...
   
   No, actually it is _required_ for the userland resume to work.  Well, 
   perhaps
   I should place a comment in there so that I don't have to explain this 
   again
   and again. :-)
  
  Can you put big bold comment there?
 
  Why is it needed? Freezer never freezes _current_ task.
 
 No, it doesn't, but this task spawns linuxrc and then calls sys_wait4() in a
 loop.
 
 Well, actually, I'll try to plant try_to_freeze() in this loop and see if that
 works.  If it doesn't, I'll add a comment.

It works. I've had:

while (pid != sys_wait4(-1, NULL, 0, NULL)) {
	yield();
	try_to_freeze();
}

there for ages for Suspend2.

Regards,

Nigel



Re: [PATCH 3/8] Use process freezer for cpu-hotplug

2007-04-06 Thread Nigel Cunningham
Hi.

On Fri, 2007-04-06 at 12:47 -0500, Nathan Lynch wrote:
 Ingo Molnar wrote:
  
  * Nathan Lynch [EMAIL PROTECTED] wrote:
  
-   raw_notifier_call_chain(&cpu_chain, CPU_LOCK_ACQUIRE, hcpu);
+   if (freeze_processes(FE_HOTPLUG_CPU)) {
+   thaw_processes(FE_HOTPLUG_CPU);
+   return -EBUSY;
+   }
+
   
   If I'm understanding correctly, this will cause
   
   # echo 0 > /sys/devices/system/cpu/cpuX/online
   
   to sometimes fail, and userspace is expected to try again?  This will 
   break existing applications.
   
   Perhaps drivers/base/cpu.c:store_online should retry as long as 
   cpu_up/down return -EBUSY.  That would avoid a userspace-visible 
   interface change.
  
  yeah. I'd even suggest a freeze_processes_nofail() API instead, that 
  does this internally, without burdening the callsites. (and once the 
  freezer becomes complete then freeze_processes_nofail() == 
  freeze_processes())
 
 Yeah, I just realized that an implementation of my proposal would busy
 loop in the kernel forever if a silly admin tried to offline the last
 cpu (we're already using -EBUSY for that case), so
 freeze_processes_nofail is a better idea :-)

If there's only one online cpu, shouldn't it return -EINVAL?
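
To illustrate why separate error codes would help, here is a minimal userspace
sketch, assuming a freezer that can fail transiently with -EBUSY. The toy_*
names and struct toy_freezer are hypothetical; toy_freeze_processes_nofail only
models the freeze_processes_nofail() idea suggested above and is not an
existing kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a freezer that can fail transiently. */
struct toy_freezer {
	int failures_left;  /* attempts that will still return -EBUSY */
	int attempts;       /* total attempts made so far */
};

static int toy_freeze_processes(struct toy_freezer *f)
{
	assert(f != NULL);
	f->attempts++;
	if (f->failures_left > 0) {
		f->failures_left--;
		return -EBUSY;
	}
	return 0;
}

/* Retry inside the helper so call sites never see -EBUSY. */
static void toy_freeze_processes_nofail(struct toy_freezer *f)
{
	while (toy_freeze_processes(f) == -EBUSY)
		;  /* a real implementation would thaw and yield here */
}

/* Offline one CPU; the last-CPU case gets its own error code. */
static int toy_cpu_down(struct toy_freezer *f, int *online_cpus)
{
	assert(online_cpus != NULL);
	if (*online_cpus <= 1)
		return -EINVAL;  /* permanent: retrying cannot help */
	toy_freeze_processes_nofail(f);
	(*online_cpus)--;
	return 0;
}
```

Because the last-CPU check returns -EINVAL before any freeze attempt, a caller
that retries on -EBUSY can never spin forever on that case.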

Regards,

Nigel



Re: USB: on suspend to ram/disk all usb devices are replugged

2007-04-06 Thread Nigel Cunningham
Hi.

On Mon, 2007-04-02 at 21:36 +0200, Pavel Machek wrote:
 Hi!
 
   But you're still likely to run into trouble if you unplug a storage
   device, move it to another system and write on it, then plug it back into
   the original system.  The PLVM would somehow have to recognize that the
   data had been changed.  I don't know a foolproof way of doing that.
   
  
  Mark the filesystem as in-use with a one-time UUID in the superblock at
  mount time. If one moved the drive to another system it would require
  an fsck to clear the UUID before the other system could use it; then
  the original machine would refuse to use the drive when the UUID didn't
  match on resume.
 
 You still need fs-specific code, I'm afraid... plus userland tool
 to reset signatures back.

You don't need userland to reset the signatures. More kernel code, sure.
But it doesn't _need_ to be userland.
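
A minimal sketch of the one-time token scheme being discussed, in userspace C.
The struct toy_superblock layout and all toy_* functions are hypothetical; a
real filesystem would need its own on-disk field and its own way of generating
the token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_superblock {
	uint64_t in_use_token;  /* 0 means cleanly released */
};

/* Mount: stamp the on-disk superblock and remember the token in memory. */
static uint64_t toy_mount(struct toy_superblock *sb, uint64_t fresh_token)
{
	assert(sb != NULL && fresh_token != 0);
	sb->in_use_token = fresh_token;
	return fresh_token;
}

/* What another system would have to do before using the disk. */
static void toy_clear_token(struct toy_superblock *sb)
{
	assert(sb != NULL);
	sb->in_use_token = 0;
}

/* Resume: only trust cached state if the token on disk is still ours. */
static bool toy_resume_ok(const struct toy_superblock *sb, uint64_t remembered)
{
	assert(sb != NULL);
	return remembered != 0 && sb->in_use_token == remembered;
}
```

The clearing step is shown as an ordinary function; nothing in the scheme
itself dictates whether it lives in the kernel or in a userland tool.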

Regards,

Nigel



Re: [PATCH 1/8] Enhance process freezer interface for usage beyond software suspend

2007-04-07 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-07 at 11:33 +0200, Rafael J. Wysocki wrote:
 On Saturday, 7 April 2007 00:20, Nigel Cunningham wrote:
   - current->flags |= PF_NOFREEZE;
   + freezer_exempt(FE_ALL);
 pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
 if (pid > 0) {
 while (pid != sys_wait4(-1, NULL, 0, NULL))
  
  Does this mean we have userland /linuxrc running with PF_NOFREEZE?
  That would be very bad...
 
 No, actually it is _required_ for the userland resume to work.  Well, 
 perhaps
 I should place a comment in there so that I don't have to explain 
 this again
 and again. :-)

Can you put big bold comment there?
   
Why is it needed? Freezer never freezes _current_ task.
   
   No, it doesn't, but this task spawns linuxrc and then calls sys_wait4() 
   in a
   loop.
   
   Well, actually, I'll try to plant try_to_freeze() in this loop and see if 
   that
   works.  If it doesn't, I'll add a comment.
  
  It works. I've had:
  
  while (pid != sys_wait4(-1, NULL, 0, NULL)) {
  	yield();
  	try_to_freeze();
  }
  
  there for ages for Suspend2.
 
 OK, thanks.  Is there any particular reason to place try_to_freeze() after
 yield()?

Not that I remember. I haven't touched that for years :)

Nigel



Re: [PATCH -mm] freezer: Remove PF_NOFREEZE from handle_initrd

2007-04-07 Thread Nigel Cunningham
Hi again.

By the way, I'm stopping using [EMAIL PROTECTED]; could you
please change your address book to nigel at nigel dot suspend2 dot net?

Thanks!

Nigel



Re: [PATCH -mm] freezer: Remove PF_NOFREEZE from handle_initrd

2007-04-07 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-07 at 18:14 +0200, Rafael J. Wysocki wrote:
 From: Rafael J. Wysocki [EMAIL PROTECTED]
 
 Make handle_initrd() call try_to_freeze() in a suitable place instead of 
 setting
 PF_NOFREEZE for the current task.
 
 Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]
 ---
  init/do_mounts_initrd.c |5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)
 
 Index: linux-2.6.21-rc6/init/do_mounts_initrd.c
 ===
 --- linux-2.6.21-rc6.orig/init/do_mounts_initrd.c
 +++ linux-2.6.21-rc6/init/do_mounts_initrd.c
 @@ -55,11 +55,12 @@ static void __init handle_initrd(void)
  	sys_mount(".", "/", NULL, MS_MOVE, NULL);
  	sys_chroot(".");
  
 -	current->flags |= PF_NOFREEZE;
  	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
  	if (pid > 0) {
 -		while (pid != sys_wait4(-1, NULL, 0, NULL))
 +		while (pid != sys_wait4(-1, NULL, 0, NULL)) {
 +			try_to_freeze();
  			yield();
 +		}
  	}
  
   /* move initrd to rootfs' /old */

ACK.

Regards,

Nigel



Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
 On Sat, 7 Apr 2007 23:20:39 +0200 Rafael J. Wysocki [EMAIL PROTECTED] 
 wrote:
 
  This should allow us to reduce the memory usage, practically always, and
  improve performance.
 
 And does it?

It will. I've been using extents for ages, for the same reasons. I don't
put them in an rb_tree because I view it as less than most efficient,
but it will still be a huge step forward from bitmaps in the normal
case.

The worst case would be if every second page of swap was in use, so that
you needed one extent per swap page. In that case, it would use more
memory than the bitmap, but far, far more common will be the case where
only one extent is needed for the whole swap partition, because the
algorithm used by the swap allocator minimises fragmentation.

Regards,

Nigel



Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-07 Thread Nigel Cunningham
Hi.

On Sun, 2007-04-08 at 01:13 +0200, Rafael J. Wysocki wrote:
 On Sunday, 8 April 2007 00:31, Nigel Cunningham wrote:
  Hi.
  
  On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
   On Sat, 7 Apr 2007 23:20:39 +0200 Rafael J. Wysocki [EMAIL PROTECTED] 
   wrote:
   
This should allow us to reduce the memory usage, practically always, and
improve performance.
   
   And does it?
 
 Yes.  There are theoretical corner cases in which it may be less efficient
 than the current approach, but in the usual situation it is _much_ better.
 
  It will. I've been using extents for ages, for the same reasons. I don't
  put them in an rb_tree because I view it as less than most efficient,
 
 Actually, I don't agree with that.  In the normal situation (ie. one extent is
 needed) there is no difference as far as the memory usage or performance
 are concerned, but if there are more extents, the rbtree should be more
 efficient.

I don't think it's worth having a big discussion over, but let me give
you the details, which you can then feel free to ignore :)

The rb_node struct adds an unsigned long and two struct rb_node *
pointers. My extents use one struct extent * pointer. The difference is
thus 12/24 bytes per extent (32/64 bits) vs 20/40. In the normal
situation, not worth worrying about, but I'm also using these for
recording the sectors we write to, and thinking about swap files and
multiple swap devices. Nearly double the memory use bites more as you
get more extents.
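
The per-extent figures above can be checked with a small user-space sketch. The struct names here are hypothetical stand-ins, not the actual Suspend2 or swsusp definitions; rb_node_like only mirrors the field layout described in the text (one unsigned long plus two pointers), and both extent layouts are assumed to carry a start and an end value.

```c
#include <stddef.h>

/* Hypothetical singly linked extent: a run of swap pages plus one link. */
struct list_extent {
	unsigned long start;		/* first page in the run */
	unsigned long end;		/* last page in the run */
	struct list_extent *next;	/* the single link */
};

/* Stand-in for the layout described above: one unsigned long
 * plus two child pointers. Not the kernel's own definition. */
struct rb_node_like {
	unsigned long parent_color;
	struct rb_node_like *right;
	struct rb_node_like *left;
};

/* Hypothetical rbtree-backed extent carrying the same payload. */
struct tree_extent {
	struct rb_node_like node;
	unsigned long start;
	unsigned long end;
};

/* Size of each layout in pointer-sized machine words. */
static size_t list_extent_words(void)
{
	return sizeof(struct list_extent) / sizeof(void *);
}

static size_t tree_extent_words(void)
{
	return sizeof(struct tree_extent) / sizeof(void *);
}
```

On a platform where unsigned long and pointers are the same width, this gives three words against five, which is 12 vs 20 bytes on 32-bit and 24 vs 40 bytes on 64-bit.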

Insertion cost for rb_node includes keeping the tree balanced. For
extents, I start with the location of the last insertion to minimise the
cost, so insertion time is usually virtually zero (inc max of last
extent or append a new one). If for some reason swap was allocated out
of order, I might need to traverse the whole chain from the start.
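
A minimal sketch of the insertion strategy just described, assuming a sorted, singly linked chain of inclusive ranges and a cached pointer to the extent touched by the previous insertion. All names (struct extent_chain, chain_add, last_touched) are illustrative; this is not the code from extent.c.

```c
#include <stdlib.h>

/* Hypothetical extent: an inclusive run [start, end] of swap page numbers. */
struct extent {
	unsigned long start, end;
	struct extent *next;
};

struct extent_chain {
	struct extent *first;
	struct extent *last_touched;	/* extent hit by the previous insertion */
	int num_extents;
};

/* Add one value; returns 0 on success, -1 if allocation fails. */
static int chain_add(struct extent_chain *chain, unsigned long value)
{
	struct extent *pos = NULL;	/* last extent with start <= value */
	struct extent *next, *new_ext;

	/* Resume from the previous insertion point unless the value lies before it. */
	if (chain->last_touched && chain->last_touched->start <= value)
		pos = chain->last_touched;

	next = pos ? pos->next : chain->first;
	while (next && next->start <= value) {
		pos = next;
		next = next->next;
	}

	if (pos && value <= pos->end) {
		/* Already covered by an existing extent. */
		chain->last_touched = pos;
		return 0;
	}

	if (pos && value - 1 == pos->end) {
		/* Common case: bump the end of the current extent. */
		pos->end = value;
		if (next && next->start - 1 == value) {
			/* The gap closed, so merge the two neighbours. */
			pos->end = next->end;
			pos->next = next->next;
			free(next);
			chain->num_extents--;
		}
		chain->last_touched = pos;
		return 0;
	}

	if (next && next->start - 1 == value) {
		/* Value sits directly in front of the following extent. */
		next->start = value;
		chain->last_touched = next;
		return 0;
	}

	/* Otherwise a new single-value extent goes between pos and next. */
	new_ext = malloc(sizeof(*new_ext));
	if (!new_ext)
		return -1;
	new_ext->start = new_ext->end = value;
	new_ext->next = next;
	if (pos)
		pos->next = new_ext;
	else
		chain->first = new_ext;
	chain->num_extents++;
	chain->last_touched = new_ext;
	return 0;
}

/* Release every extent in the chain. */
static void chain_free(struct extent_chain *chain)
{
	while (chain->first) {
		struct extent *victim = chain->first;

		chain->first = victim->next;
		free(victim);
	}
	chain->last_touched = NULL;
	chain->num_extents = 0;
}
```

When values arrive in ascending order, each call takes the "bump the end" branch without walking the chain. Only a value that lies before the cached extent forces a scan from the head.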

Normal usage in both cases is simply iterating through the list, so I
guess the cost would be approximately the same.

Deletion cost would include rebalancing for the rb_nodes.

Code cost is a gain for you - you're leveraging existing code, I'm
adding a bit more. extent.c is 300 lines including code for serialising
the chains in an image header and iterating through a group of chains
(multiple swap devices support).

rb_nodes seem to be the wrong solution to me because we generally don't
care about searching. We care about minimising memory usage and
maximising the speed of iteration, insertion and deletion. I believe
I've managed to do that with a singly linked, sorted list.

That said, we've agreed that we're normally talking about a small number
of extents, so it's probably not worth the bandwidth I've already
spent :)

Regards,

Nigel




Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-08 Thread Nigel Cunningham
Hi.

On Sun, 2007-04-08 at 18:47 +0200, Rafael J. Wysocki wrote:
 On Sunday, 8 April 2007 01:42, Nigel Cunningham wrote:
  Hi.
  
  On Sun, 2007-04-08 at 01:13 +0200, Rafael J. Wysocki wrote:
   On Sunday, 8 April 2007 00:31, Nigel Cunningham wrote:
Hi.

On Sat, 2007-04-07 at 15:06 -0700, Andrew Morton wrote:
 On Sat, 7 Apr 2007 23:20:39 +0200 Rafael J. Wysocki [EMAIL 
 PROTECTED] wrote:
 
  This should allow us to reduce the memory usage, practically 
  always, and
  improve performance.
 
 And does it?
   
   Yes.  There are theoretical corner cases in which it may be less efficient
   than the current approach, but in the usual situation it is _much_ better.
   
It will. I've been using extents for ages, for the same reasons. I don't
put them in an rb_tree because I view it as less than most efficient,
   
   Actually, I don't agree with that.  In the normal situation (ie. one 
   extent is
   needed) there is no difference as far as the memory usage or performance
   are concerned, but if there are more extents, the rbtree should be more
   efficient.
  
  I don't think it's worth having a big discussion over, but let me give
  you the details, which you can then feel free to ignore :)
  
  The rb_node struct adds an unsigned long and two struct rb_node *
  pointers. My extents use one struct extent * pointer. The difference is
  thus 12/24 bytes per extent (32/64 bits) vs 20/40.
 
 Well, you use open-coded lists.  If you used list.h lists, the numbers 
 would be different. :-)

Yes, but I don't need doubly linked lists.

  In the normal situation, not worth worrying about, but I'm also using these 
  for
  recording the sectors we write to, and thinking about swap files and
  multiple swap devices. Nearly double the memory use bites more as you
  get more extents.
 
  Insertion cost for rb_node includes keeping the tree balanced. For
  extents, I start with the location of the last insertion to minimise the
  cost, so insertion time is usually virtually zero (inc max of last
  extent or append a new one).
 
 Isn't the appending one actually linear worst-case?

Worst case would be the swap allocator returning swap pages in reverse
order. As you and I both know, that doesn't happen. I first implemented
this in 2003. If the worst case actually happened, I would have seen the
effect by now :)

  If for some reason swap was allocated out of order, I might need to traverse
  the whole chain from the start. 
 
 Exactly.
 
  Normal usage in both cases is simply iterating through the list, so I
  guess the cost would be approximately the same.
  
  Deletion cost would include rebalancing for the rb_nodes.
 
 In swsusp the deletions are needed only if there's an error.

When freeing swap at the end of the cycle?

  Code cost is a gain for you - you're leveraging existing code, I'm
  adding a bit more. extent.c is 300 lines including code for serialising
  the chains in an image header and iterating through a group of chains
  (multiple swap devices support).
  
  rb_nodes seem to be the wrong solution to me because we generally don't
  care about searching. We care about minimising memory usage and
  maximising the speed of iteration, insertion and deletion. I believe
  I've managed to do that with a singly linked, sorted list.
 
 The insertion also uses searching and in fact I don't really care for anything
 else.

Ok :)

Nigel



Re: [RFC][PATCH -mm] swsusp: Use rbtree for tracking allocated swap

2007-04-09 Thread Nigel Cunningham
Hi.

On Mon, 2007-04-09 at 15:03 +0200, Rafael J. Wysocki wrote:
 On Sunday, 8 April 2007 23:07, Nigel Cunningham wrote:
 [--snip--]
Normal usage in both cases is simply iterating through the list, so I
guess the cost would be approximately the same.

Deletion cost would include rebalancing for the rb_nodes.
   
   In swsusp the deletions are needed only if there's an error.
  
  When freeing swap at the end of the cycle?
 
 That depends on what you mean by 'the end'. :-)
 
 We free swap if the image saving fails only, since it's allocated after we've
 created the image.  After the resume, the state of swap from before the image
 creation is the current one anyway.

Ah, of course. I forgot that temporarily.

Nigel



Re: mconf not removed by make mrproper

2007-04-11 Thread Nigel Cunningham
Hi.

On Sun, 2007-04-01 at 23:17 +0200, Sam Ravnborg wrote:
 On Thu, Feb 01, 2007 at 02:05:49PM +1100, Nigel Cunningham wrote:
  Hi.
  
  The scripts/kconfig/mconf target isn't removed by the make mrproper
  target. I can see a couple of possibilities, but wasn't sure which you'd
  prefer, so thought I'd just raise the issue.
  
  It's only an issue for me because my patch generation script relies on
  make mrproper making a properly clean tree.
 
 Fixed - thanks.
 
   Sam

Works fine here; thanks!

Acked-by: Nigel Cunningham [EMAIL PROTECTED]




Re: 2.6.21-rc5: swsusp: Not enough free memory

2007-04-13 Thread Nigel Cunningham
Hi.

On Fri, 2007-04-13 at 14:00 +0200, Rafael J. Wysocki wrote:
  
  Shrinking memory...  Pages needed: 128103 normal, 0 highmem
  Pages needed: 125226 normal, 0 highmem
  Pages needed: -5757 normal, 0 highmem
  Pages needed: -5757 normal, 0 highmem
  Pages needed: -5757 normal, 0 highmem
  Pages needed: -5757
  Pages needed: 127953 normal, 0 highmem
  Pages needed: 125076 normal, 0 highmem
  Pages needed: -6043 normal, 0 highmem
  Pages needed: -6043 normal, 0 highmem
  Pages needed: -6043 normal, 0 highmem
  Pages needed: -6043
  done (200 pages freed)
  Freed 800 kbytes in 0.16 seconds (5.00 MB/s)
  Suspending console(s)
  ...
  CPU1 is down
  swsusp: critical section:
  swsusp: Need to copy 131358 pages
  swsusp: Normal pages needed: 131358
  swsusp: Normal pages needed: 131358 + 1024 + 22, available pages: 130607
 
 Well, it looks like someone allocated about 6000 pages after we had freed
 enough memory for suspending.

We have a tunable allowance in Suspend2 for this, because fglrx
allocates a lot of pages in its suspend routine if DRI is enabled. I
think some other drivers do too, but fglrx is the main one I know.

Nigel





Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)

2007-04-13 Thread Nigel Cunningham
Hi.

On Fri, 2007-04-13 at 22:41 +0200, Rafael J. Wysocki wrote:
 On Friday, 13 April 2007 14:21, Nigel Cunningham wrote:
  Hi.
  
  On Fri, 2007-04-13 at 14:00 +0200, Rafael J. Wysocki wrote:

Shrinking memory...  Pages needed: 128103 normal, 0 highmem
Pages needed: 125226 normal, 0 highmem
Pages needed: -5757 normal, 0 highmem
Pages needed: -5757 normal, 0 highmem
Pages needed: -5757 normal, 0 highmem
Pages needed: -5757
Pages needed: 127953 normal, 0 highmem
Pages needed: 125076 normal, 0 highmem
Pages needed: -6043 normal, 0 highmem
Pages needed: -6043 normal, 0 highmem
Pages needed: -6043 normal, 0 highmem
Pages needed: -6043
done (200 pages freed)
Freed 800 kbytes in 0.16 seconds (5.00 MB/s)
Suspending console(s)
...
CPU1 is down
swsusp: critical section:
swsusp: Need to copy 131358 pages
swsusp: Normal pages needed: 131358
swsusp: Normal pages needed: 131358 + 1024 + 22, available pages: 130607
   
   Well, it looks like someone allocated about 6000 pages after we had freed
   enough memory for suspending.
  
  We have a tunable allowance in Suspend2 for this, because fglrx
  allocates a lot of pages in its suspend routine if DRI is enabled. I
  think some other drivers do too, but fglrx is the main one I know.
 
 I wasn't aware of that, thanks for the information.
 
 I think this means we'll probably need to add a tunable, similar to 
 image_size,
 that will allow the users to specify how much spare memory they want to 
 reserve
 for suspending (instead of the constant PAGES_FOR_IO).  IMO we can call it
 'spare_memory'.
 
 Still, this doesn't look like a real solution, because it would require the
 users affected by this problem to experiment with suspending in order to
 figure out how much spare memory they will need.
 
 IMO to really fix the problem, we should let the drivers that need much memory
 for suspending allocate it _before_ the memory shrinker is called.  For this
 purpose we can use notifiers that will be called before we start the shrinking
 of memory.  Namely, if a driver needs to allocate substantial amount of memory
 for suspending, it can register a notifier that will be called before we try 
 to
 shrink memory.  Then, the memory needed by the driver may be allocated in
 this notifier (of course, in that case it will also have to be called if the
 shrinking of memory fails, so that the memory allocated by the driver for
 suspending can be freed) and used in the driver's .suspend() routine.
 
 Comments welcome.

Yeah. I've thought about it too. It could also be good for that acpi
routine that was allocating memory in an atomic context with the
wrong flags. Another idea that occurred to me would be to allow drivers
to have a routine saying how much memory they will need, which we could
call to calculate the allowance we need. Personally, I think the
notifier chain is simpler and preferable :)

Regards,

Nigel



Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)

2007-04-13 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-14 at 00:10 +0200, Pavel Machek wrote:
 Hi!
 
Well, it looks like someone allocated about 6000 pages after we had 
freed
enough memory for suspending.
   
   We have a tunable allowance in Suspend2 for this, because fglrx
   allocates a lot of pages in its suspend routine if DRI is enabled. I
   think some other drivers do too, but fglrx is the main one I know.
  
  I wasn't aware of that, thanks for the information.
  
  I think this means we'll probably need to add a tunable, similar to 
  image_size,
  that will allow the users to specify how much spare memory they want to 
  reserve
  for suspending (instead of the constant PAGES_FOR_IO).  IMO we can call it
  'spare_memory'.
 
 Just increase PAGES_FOR_IO. This should not be tunable.

If we don't have a means for drivers to pre-allocate or say how much
memory they need, it should be tunable. Frankly, I'm startled that you
guys haven't heard of this issue before now. I can't believe everyone
who has ever wanted to hibernate with DRM enabled has been using
Suspend2. Maybe this is one of the sources of complaints that swsusp
isn't reliable?

  IMO to really fix the problem, we should let the drivers that need much 
  memory
  for suspending allocate it _before_ the memory shrinker is called.  For this
  purpose we can use notifiers that will be called before we start the 
  shrinking
  of memory.  Namely, if a driver needs to allocate substantial amount
  of memory
 
 Yes please. Using that notifier without leaking the memory will be
 interesting but if someone needs so much memory during suspend, let
 them eat their own complexity.

It doesn't need to be that complex. Add another (optional) function to
the driver model to let drivers say how much they want and it becomes
trivial. Maybe this idea should be preferred over the notifier chain.
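
The idea of an optional per-driver function could look roughly like the user-space sketch below. Everything in it is hypothetical: struct demo_driver, the suspend_pages_needed callback and the page count returned by demo_gfx_pages are made up for illustration and do not exist in the driver model.

```c
#include <stddef.h>

/* Hypothetical driver record with an optional "how much do you need" hook. */
struct demo_driver {
	const char *name;
	unsigned long (*suspend_pages_needed)(void);	/* may be NULL */
};

/* Sum the requests of every driver that provides the optional hook. */
static unsigned long suspend_allowance(const struct demo_driver *drivers,
				       size_t count)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < count; i++)
		if (drivers[i].suspend_pages_needed)
			total += drivers[i].suspend_pages_needed();
	return total;
}

/* Example hook; the figure is purely illustrative. */
static unsigned long demo_gfx_pages(void)
{
	return 9000;
}
```

Drivers that leave the hook NULL contribute nothing, so the allowance grows only on systems whose drivers actually ask for memory.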

Regards,

Nigel



Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)

2007-04-13 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-14 at 00:35 +0200, Rafael J. Wysocki wrote:
 On Saturday, 14 April 2007 00:10, Pavel Machek wrote:
  Hi!
  
 Well, it looks like someone allocated about 6000 pages after we had 
 freed
 enough memory for suspending.

We have a tunable allowance in Suspend2 for this, because fglrx
allocates a lot of pages in its suspend routine if DRI is enabled. I
think some other drivers do too, but fglrx is the main one I know.
   
   I wasn't aware of that, thanks for the information.
   
   I think this means we'll probably need to add a tunable, similar to 
   image_size,
   that will allow the users to specify how much spare memory they want to 
   reserve
   for suspending (instead of the constant PAGES_FOR_IO).  IMO we can call it
   'spare_memory'.
  
  Just increase PAGES_FOR_IO. This should not be tunable.
 
 Well, I'm not sure.  First, we don't really know what the value of it should 
 be
 and this alone is a good enough reason for making it tunable, IMHO.  Second, I
 think different systems may need different PAGES_FOR_IO and taking just the
 maximum (even if we learn how much that actually is) seems to be wasteful in
 the vast majority of cases.  Finally, I think it may be possible to speed up
 image saving by increasing PAGES_FOR_IO without playing with the
 image size and we can let the user try it (think of distro kernels that are
 compiled for many different users).

It does vary according to the amount of video memory used for DRM, if I
understand correctly.

Nigel



Re: [linux-pm] [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)

2007-04-13 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-14 at 00:38 +0200, Pavel Machek wrote:
 Hi!
 
  Well, it looks like someone allocated about 6000 pages after we had 
  freed
  enough memory for suspending.
 
 We have a tunable allowance in Suspend2 for this, because fglrx
 allocates a lot of pages in its suspend routine if DRI is enabled. I
 think some other drivers do too, but fglrx is the main one I know.

I wasn't aware of that, thanks for the information.

I think this means we'll probably need to add a tunable, similar to 
image_size,
that will allow the users to specify how much spare memory they want to 
reserve
for suspending (instead of the constant PAGES_FOR_IO).  IMO we can call 
it
'spare_memory'.
   
   Just increase PAGES_FOR_IO. This should not be tunable.
  
  If we don't have a means for drivers to pre-allocate or say how much
  memory they need, it should be tunable. Frankly, I'm startled that you
  guys haven't heard of this issue before now. I can't believe everyone
  who has ever wanted to hibernate with DRM enabled has been using
  Suspend2. Maybe this is one of the sources of complaints that swsusp
  isn't reliable?
 
 We do not support closed-source drivers, and open-source drivers are
 well behaved.

I didn't say fglrx was the only example. Any system using DRI (not DRM,
sorry), would, I think, be expected. I just mention fglrx because I have
a Radeon 200M that can only use fglrx for Beryl etc at the mo - it's the
one I'm familiar with.

IMO to really fix the problem, we should let the drivers that need much 
memory
for suspending allocate it _before_ the memory shrinker is called.  For 
this
purpose we can use notifiers that will be called before we start the 
shrinking
of memory.  Namely, if a driver needs to allocate substantial amount
of memory
   
   Yes please. Using that notifier without leaking the memory will be
   interesting but if someone needs so much memory during suspend, let
   them eat their own complexity.
  
  It doesn't need to be that complex. Add another (optional) function to
  the driver model to let drivers say how much they want and it becomes
  trivial. Maybe this idea should be preferred over the notifier chain.
 
 Actually, it is trivial to preallocate during boot ;-). As the notifier
 chain can be useful for other stuff, too, I'd go that way.

Pavel! Talk sense! You're not seriously suggesting squirreling away 35
megabytes of a user's memory at boot just because they might want to
hibernate with DRI enabled later? Yes, 35 megabytes is a realistic
amount.

Regards,

Nigel




Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)

2007-04-13 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-14 at 00:40 +0200, Pavel Machek wrote:
 Hi!
 
  Well, it looks like someone allocated about 6000 pages after we had 
  freed
  enough memory for suspending.
 
 We have a tunable allowance in Suspend2 for this, because fglrx
 allocates a lot of pages in its suspend routine if DRI is enabled. I
 think some other drivers do too, but fglrx is the main one I know.

I wasn't aware of that, thanks for the information.

I think this means we'll probably need to add a tunable, similar to 
image_size,
that will allow the users to specify how much spare memory they want to 
reserve
for suspending (instead of the constant PAGES_FOR_IO).  IMO we can call 
it
'spare_memory'.
   
   Just increase PAGES_FOR_IO. This should not be tunable.
  
  Well, I'm not sure.  First, we don't really know what the value of it 
  should be
  and this alone is a good enough reason for making it tunable, IMHO.  
  Second, I
  think different systems may need different PAGES_FOR_IO and taking just the
  maximum (even if we learn how much that actually is) seems to be wasteful in
 
 Well,  it is wasteful as in we save slightly smaller image than we
 could. That's okay with me.

No. If the driver can't allocate the memory, your call to device_suspend
will fail. This isn't about image size but about success or failure to
hibernate.

Nigel



Re: [RFD] swsusp problem: Drivers allocate much memory during suspend (was: Re: 2.6.21-rc5: swsusp: Not enough free memory)

2007-04-13 Thread Nigel Cunningham
Hi.

On Sat, 2007-04-14 at 00:57 +0200, Rafael J. Wysocki wrote:
Well, I'm not sure.  First, we don't really know what the value of it 
should be
and this alone is a good enough reason for making it tunable, IMHO.  
Second, I
think different systems may need different PAGES_FOR_IO and taking just 
the
maximum (even if we learn how much that actually is) seems to be 
wasteful in
   
   Well,  it is wasteful as in we save slightly smaller image than we
   could. That's okay with me.
  
  No. If the driver can't allocate the memory, your call to device_suspend
  will fail. This isn't about image size but about success or failure to
  hibernate.
 
 If we take PAGES_FOR_IO to be the maximum over all possible configurations
 that can hibernate, the majority of systems will just create smaller images 
 than
 they could have created for smaller PAGES_FOR_IO, but all of them will be
 able to hibernate. :-)

You also use PAGES_FOR_IO in enough_free_mem. Say you set it to the 9000
pages I mentioned before (35M). On a machine with 64 megabytes of
memory, you'll never be able to suspend because you'll never satisfy

free > nr_pages + PAGES_FOR_IO + meta

I'll freely admit that 64 megabytes is tiny nowadays, but it's not
completely unknown. The point is really that you're effectively making
swsusp unusable for machines with RAM < (PAGES_FOR_IO * (say) 3). But
what do you set PAGES_FOR_IO to? There'll always be someone with
$WHIZ_BANG_CONFIG who is pushing to have the value increased, and every
increase knocks out more of your low-end users.
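
As a worked check of the inequality above, here is a user-space restatement using the figures from the log earlier in this thread (131358 pages to copy, 1024 for PAGES_FOR_IO, 22 for metadata, 130607 available). The function bodies are a sketch rather than the kernel source, and the 4 KiB page size is an assumption taken from the log line "200 pages freed ... 800 kbytes".

```c
#include <stdbool.h>

#define PAGE_KIB 4UL	/* assumed page size in KiB */

/* The condition quoted above: free > nr_pages + PAGES_FOR_IO + meta. */
static bool enough_free_mem(unsigned long free_pages, unsigned long nr_pages,
			    unsigned long pages_for_io, unsigned long meta)
{
	return free_pages > nr_pages + pages_for_io + meta;
}

/* Smallest free-page count that would pass the check. */
static unsigned long min_free_pages(unsigned long nr_pages,
				    unsigned long pages_for_io,
				    unsigned long meta)
{
	return nr_pages + pages_for_io + meta + 1;
}

/* Convert a page count to KiB under the assumed page size. */
static unsigned long pages_to_kib(unsigned long pages)
{
	return pages * PAGE_KIB;
}
```

With the log figures the right-hand side is 132404 pages against 130607 available, so the check fails. A fixed reserve of 9000 pages comes to 36000 KiB, roughly 35 MB, out of the 16384 pages that make up 64 MB.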

Regards,

Nigel



Re: Problem with freezable workqueues

2007-02-27 Thread Nigel Cunningham
Hi.

On Wed, 2007-02-28 at 01:08 +0100, Rafael J. Wysocki wrote:
 On Wednesday, 28 February 2007 01:01, Johannes Berg wrote:
  On Wed, 2007-02-28 at 00:57 +0100, Rafael J. Wysocki wrote:
  
   Okay, in that case I'd suggest removing create_freezeable_workqueue() and
   make all workqueues nonfreezable once again for 2.6.21 (as far as I know, 
   only
   the two XFS workqueues are affected).
  
  I think Nigel might object but I forgot what specific trouble XFS was
  causing him.
 
 We suspected that the XFS' worker threads might commit I/O after
 freeze_processes() has returned, but that hasn't been supported by evidence,
 as far as I can recall.
 
 Also, making them freezable was controversial ...

Controversy is no reason to give in! Nevertheless, I think you're right
- I believe the XFS guys said they fixed the issue that had caused I/O
to be submitted post-freeze. Well, we'll see if it appears again, won't
we?

Regards,

Nigel



Re: Resume from S2R fails after dpm_resume()

2007-03-02 Thread Nigel Cunningham
Hi.

On Fri, 2007-03-02 at 07:25 -0700, Tim Gardner wrote:
 Pavel Machek wrote:
  Hi!
  
  I instrumented 2.6.21-rc1 base/power/resume.c device_resume() with
  TRACE_RESUME(0) as the last statement in the function. Sure enough it
  was the last hash value in the RTC after a hard reboot when resume failed:
 
  [   12.028820]   hash matches drivers/base/power/resume.c:104
 
  The machine appears to be absolutely wedged after initiating resume by
  pressing the power button. The disk flashes for a half second or so,
  then that's it.
 
  It is a Dell XPS, BIOS rev A04. I'm using 'echo 1 > /sys/power/pm_trace;
  echo mem > /sys/power/state' to initiate the S2R sequence.
 
  Any suggestions on where to go from here?
  
  Did it work ok in 2.6.20? Can you try to get video working/get serial
  console/something?
  Pavel
 
 Pavel,
 
 The last version that worked well was Ubuntu Edgy (2.6.17). It was
 broken by 2.6.18. I have not started the 'git bisect' process, instead
 I've been trying to figure out why it doesn't work in 2.6.21-rc2. Using
 the TRACE_RESUME macro I've drilled down to
 kernel/printk.c:__call_console_drivers. So far the last trace info that
 I have is just before the call to con->write(). I'm trying to figure out
 what driver has registered as the console (intel_agp or agpgart?).
 
 Am I banging my head on a known problem?

Tim, it's possible that the problem you're seeing is completely
different to the one Pavel is looking for. Given that you're down to
looking in console write code, I wonder if it's related to the changes
to console suspending that were done around that time. I'd suggest
either looking in LKML or Linux-PM archives for a commit related to
suspending the console, or doing your git bisect.

Regards,

Nigel



Re: [linux-pm] [PATCH 2/2 -stable] libata: add missing CONFIG_PM in LLDs

2007-03-02 Thread Nigel Cunningham
Hi.

On Fri, 2007-03-02 at 17:46 +0900, Tejun Heo wrote:
 Add missing #ifdef CONFIG_PM conditionals around all PM related parts
 in libata LLDs.
 
 Signed-off-by: Tejun Heo [EMAIL PROTECTED]
 ---
  drivers/ata/ahci.c  |   14 ++
  drivers/ata/ata_generic.c   |4 
  drivers/ata/ata_piix.c  |4 
  drivers/ata/pata_ali.c  |6 ++
  drivers/ata/pata_amd.c  |6 ++
  drivers/ata/pata_atiixp.c   |4 
  drivers/ata/pata_cmd64x.c   |6 ++
  drivers/ata/pata_cs5520.c   |7 +++
  drivers/ata/pata_cs5530.c   |6 ++
  drivers/ata/pata_cs5535.c   |4 
  drivers/ata/pata_cypress.c  |4 
  drivers/ata/pata_efar.c |4 
  drivers/ata/pata_hpt366.c   |7 ++-
  drivers/ata/pata_hpt3x3.c   |6 ++
  drivers/ata/pata_it821x.c   |6 ++
  drivers/ata/pata_jmicron.c  |4 
  drivers/ata/pata_marvell.c  |4 
  drivers/ata/pata_mpiix.c|4 
  drivers/ata/pata_netcell.c  |4 
  drivers/ata/pata_ns87410.c  |4 
  drivers/ata/pata_oldpiix.c  |4 
  drivers/ata/pata_opti.c |4 
  drivers/ata/pata_optidma.c  |4 
  drivers/ata/pata_pdc202xx_old.c |4 
  drivers/ata/pata_radisys.c  |4 
  drivers/ata/pata_rz1000.c   |6 ++
  drivers/ata/pata_sc1200.c   |4 
  drivers/ata/pata_serverworks.c  |6 ++
  drivers/ata/pata_sil680.c   |4 
  drivers/ata/pata_sis.c  |4 
  drivers/ata/pata_triflex.c  |4 
  drivers/ata/pata_via.c  |6 ++
  drivers/ata/sata_sil.c  |2 ++
  drivers/ata/sata_sil24.c|2 ++
  34 files changed, 165 insertions(+), 1 deletion(-)
 
 Index: work1/drivers/ata/ahci.c
 ===
 --- work1.orig/drivers/ata/ahci.c
 +++ work1/drivers/ata/ahci.c
 @@ -225,10 +225,12 @@ static void ahci_thaw(struct ata_port *a
  static void ahci_error_handler(struct ata_port *ap);
  static void ahci_vt8251_error_handler(struct ata_port *ap);
  static void ahci_post_internal_cmd(struct ata_queued_cmd *qc);
 +#ifdef CONFIG_PM
  static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg);
  static int ahci_port_resume(struct ata_port *ap);
  static int ahci_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg);
  static int ahci_pci_device_resume(struct pci_dev *pdev);

Wouldn't it be simpler to add

#else
#define ahci_port_suspend(port, message) (NULL)

etc (or something similar)?

Regards,

Nigel





Re: [linux-pm] [PATCH 2/2 -stable] libata: add missing CONFIG_PM in LLDs

2007-03-02 Thread Nigel Cunningham
Hi.

On Sat, 2007-03-03 at 12:20 +0900, Tejun Heo wrote:
 Hello, Nigel.
 
 Nigel Cunningham wrote:
  Index: work1/drivers/ata/ahci.c
  ===
  --- work1.orig/drivers/ata/ahci.c
  +++ work1/drivers/ata/ahci.c
  @@ -225,10 +225,12 @@ static void ahci_thaw(struct ata_port *a
   static void ahci_error_handler(struct ata_port *ap);
   static void ahci_vt8251_error_handler(struct ata_port *ap);
   static void ahci_post_internal_cmd(struct ata_queued_cmd *qc);
  +#ifdef CONFIG_PM
   static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg);
   static int ahci_port_resume(struct ata_port *ap);
   static int ahci_pci_device_suspend(struct pci_dev *pdev, pm_message_t 
  mesg);
   static int ahci_pci_device_resume(struct pci_dev *pdev);
  
  Wouldn't it be simpler to add
  
  #else
  #define ahci_port_suspend(port, message) (NULL)
  
  etc (or something similar)?
 
 ahci_port_suspend() is used to fill ata_port_ops vector, so it needs to
 be a function.  If you're talking about defining NULL function, yeah,
 that will remove half of CONFIG_PMs but would require dummy definitions
 for all functions.  I think both are ugly.  :-)

Yeah, I didn't look really carefully; an empty static function would
have been what I'd have written if I'd paid more attention.

 I'm working on a linker trick.  Please take a look at the following thread.
 
   http://thread.gmane.org/gmane.linux.ide/16475

Not familiar with fancy things like that, so I'll just pipe down and
leave you to it :).

Regards,

Nigel



Re: Problem with freezable workqueues

2007-03-06 Thread Nigel Cunningham
Hi.

On Tue, 2007-03-06 at 21:31 +0100, Rafael J. Wysocki wrote:
 Hi,
 
 On Tuesday, 6 March 2007 01:30, Johannes Berg wrote:
  On Tue, 2007-02-27 at 22:51 +0100, Rafael J. Wysocki wrote:
  
   For 2.6.21-rc1 I've invented the appended workaround (works for me, 
   waiting for
   Johannes to confirm it works for him too), but I think we need something 
   better
   for -mm and future kernels.
  
  Finally I could get back to this but after reading the thread I figured
  it might not be necessary to test this. Please let me know ASAP if you
  want this patch tested as well or it'll take quite a long time (going
  skiing for a week on Saturday)
 
 I think it won't be necessary.
 
 For now, we have decided to make the workqueues nonfreezable (the patch for
 that has already been merged, AFAICT).
 
  In any case, I made the two xfs workqueues non-freezable and everything
  on my quad powermac works again, I also couldn't detect any filesystem
  corruption.
 
 Good, thanks for the confirmation.
 
  I wanted to adapt the BUG_ON(block IO not from suspend code) 
  patch from suspend2 but haven't gotten around to it yet.
 
 That might be a good idea for other reasons too, but I'd prefer WARN_ON()
 instead of BUG_ON() when you're at it. ;-)

I made it BUG_ON() because, if Suspend2 is running, any I/O coming from
a source other than Suspend2 may be I/O on a page that's been used for
the atomic copy, and in that case it would definitely be bad to write it
to disk. If swsusp is running, the BUG_ON() won't trigger IIRC.

Regards,

Nigel



Re: Radeon xpress 200m and radeonfb kinda work

2007-03-07 Thread Nigel Cunningham
Hi.

On Tue, 2007-03-06 at 01:16 +0100, Johan Henriksson wrote:
 Hi!
 
 I have gotten the radeon xpress 200m (the version without dedicated
 vmem)
  to work with radeonfb.
 The attached patch (against linux-2.6.20.1) works for me.
 Since I don't have any docs for the card I am unsure if the patch is 
 100% correct.
 Can someone else with a 200m try it out?
 (I have tested it by enabling  fbcon and radeonfb in the kernel  and
  added video=radeonfb to lilo. This gave me a nice 1280x800
 console :) )
 
 /Johan Henriksson
 
 Please CC, I'm not on the list.
 
 @@ -2329,7 +2332,7 @@ static int __devinit radeonfb_pci_regist
   /* -2 is special: means  ON on mobility chips and do not
* change on others
*/
 - radeonfb_pm_init(rinfo, rinfo->is_mobility ? 1 : -1, 
 ignore_devlist, force_sleep);
 + radeonfb_pm_init(rinfo, -1,ignore_devlist, 
 force_sleep);//rinfo->is_mobility ? 1 : -1);

That looks like it might break !200M. Maybe something like
rinfo->is_mobility && !rinfo->rs480 (with additional modifications to
define an rs480, of course) - or a more generic name indicating why the
rs480 is different?

Regards,

Nigel



Re: [PATCH 0/20] x86_64 Relocatable bzImage support (V4)

2007-03-07 Thread Nigel Cunningham
Hi.

On Wed, 2007-03-07 at 07:07 -0800, Arjan van de Ven wrote:
 On Wed, 2007-03-07 at 12:27 +0530, Vivek Goyal wrote:
  Hi,
  
  Here is another attempt on x86_64 relocatable bzImage patches(V4). This
  patchset makes a bzImage relocatable and same kernel binary can be loaded
  and run from different physical addresses.
 
 
 have these patches been extensively tested with various suspend
 scenarios? (S1,S3,S4 in acpi speak or s2ram and s2disk in Linux speak)

We did work on this for RHEL5, getting relocatable kernel support
working fine with S4. While doing it and since, I've been running
Suspend2 with the same patch.

Since that work, Vivek has done more modifications, but I can confirm
that the basic design is reliable with S4. Haven't tried S3, but can do.
Will report back shortly.

Regards,

Nigel



Re: [PATCH 0/20] x86_64 Relocatable bzImage support (V4)

2007-03-07 Thread Nigel Cunningham
Hi.

On Thu, 2007-03-08 at 07:49 +1100, Nigel Cunningham wrote:
 Hi.
 
 On Wed, 2007-03-07 at 07:07 -0800, Arjan van de Ven wrote:
  On Wed, 2007-03-07 at 12:27 +0530, Vivek Goyal wrote:
   Hi,
   
   Here is another attempt on x86_64 relocatable bzImage patches(V4). This
   patchset makes a bzImage relocatable and same kernel binary can be loaded
   and run from different physical addresses.
  
  
  have these patches been extensively tested with various suspend
  scenarios? (S1,S3,S4 in acpi speak or s2ram and s2disk in Linux speak)
 
 We did work on this for RHEL5, getting relocatable kernel support
 working fine with S4. While doing it and since, I've been running
 Suspend2 with the same patch.
 
 Since that work, Vivek has done more modifications, but I can confirm
 that the basic design is reliable with S4. Haven't tried S3, but can do.
 Will report back shortly.

S3 works okay here with a relocatable x86_64 kernel (2.6.20).

Regards,

Nigel



Re: [PATCH 16/20] swsusp: do not use virt_to_page on kernel data address

2007-03-07 Thread Nigel Cunningham
Hi.

On Wed, 2007-03-07 at 23:50 +0100, Pavel Machek wrote:
 Hi!
 
  o virt_to_page() call should be used on kernel linear addresses and not
on kernel text and data addresses. Swsusp code uses it on kernel data
(statically allocated swsusp_header).
  
  o Allocate swsusp_header dynamically so that virt_to_page() can be used
safely.
  
  o I am changing this because in next few patches, __pa() on x86_64 will
no longer support kernel text and data addresses and hibernation breaks. 
  
  Signed-off-by: Vivek Goyal [EMAIL PROTECTED]
 
 (I assume this was tested, too?)

Absolutely.

Nigel



Re: [PATCH 0/20] x86_64 Relocatable bzImage support (V4)

2007-03-08 Thread Nigel Cunningham
Hi.

On Thu, 2007-03-08 at 10:10 +0530, Vivek Goyal wrote:
 On Thu, Mar 08, 2007 at 10:15:02AM +1100, Nigel Cunningham wrote:
  Hi.
  
  On Thu, 2007-03-08 at 07:49 +1100, Nigel Cunningham wrote:
   Hi.
   
   On Wed, 2007-03-07 at 07:07 -0800, Arjan van de Ven wrote:
On Wed, 2007-03-07 at 12:27 +0530, Vivek Goyal wrote:
 Hi,
 
 Here is another attempt on x86_64 relocatable bzImage patches(V4). 
 This
 patchset makes a bzImage relocatable and same kernel binary can be 
 loaded
 and run from different physical addresses.


have these patches been extensively tested with various suspend
scenarios? (S1,S3,S4 in acpi speak or s2ram and s2disk in Linux speak)
   
   We did work on this for RHEL5, getting relocatable kernel support
   working fine with S4. While doing it and since, I've been running
   Suspend2 with the same patch.
   
   Since that work, Vivek has done more modifications, but I can confirm
   that the basic design is reliable with S4. Haven't tried S3, but can do.
   Will report back shortly.
  
  S3 works okay here with a relocatable x86_64 kernel (2.6.20).
  
 
 Hi Nigel,
 
 Is it possible to test S3 with 2.6.21-rc2 kernels also. Right now I don't 
 have access to any machine supporting S3. I tested it at the time of my last
 posting and it had worked well. Appreciate your help.

Tested with rc3 (rc2 wouldn't compile), and it works fine.

If you're willing, please add

Signed-off-by: Nigel Cunningham [EMAIL PROTECTED]

or

Acked-by: Nigel Cunningham [EMAIL PROTECTED]

to the hibernation related parts as you see appropriate, since I helped
(albeit in a minor way compared to your work and Eric's work) with
preparing and testing them for RHEL5 and have confirmed they're still ok
in this version.

Regards,

Nigel



Re: [patch 7/9] unprivileged mounts: allow unprivileged fuse mounts

2008-01-08 Thread Nigel Cunningham
Hi.

Miklos Szeredi wrote:
 On Tue 2008-01-08 12:35:09, Miklos Szeredi wrote:
 From: Miklos Szeredi [EMAIL PROTECTED]

 Use FS_SAFE for fuse fs type, but not for fuseblk.

 FUSE was designed from the beginning to be safe for unprivileged users.  
 This
 has also been verified in practice over many years.  In addition 
 unprivileged
 Eh? So 'kill -9 no longer works' and 'suspend no longer works' is not
 considered important enough to even mention?
 
 No.  Because in practice they don't seem to matter.  Also because
 there's no way in which fuse could be done differently to address
 these issues.

Could you clarify, please? I hope I'm getting the wrong end of the stick
- it sounds to me like you and Pavel are saying that this patch breaks
suspending to ram (and hibernating?) but you want to push it anyway
because you haven't been able to produce an instance, don't think
suspending or hibernating matter and couldn't fix fuse anyway?

 The 'kill -9' thing is basically due to VFS level locking not being
 interruptible.  It could be changed, but I'm not sure it's worth it.
 
 For the suspend issue, there are also no easy solutions.

What are the non-easy solutions?

Regards,

Nigel


Re: [patch 7/9] unprivileged mounts: allow unprivileged fuse mounts

2008-01-09 Thread Nigel Cunningham
Hi.

Miklos Szeredi wrote:
 On Tue 2008-01-08 12:35:09, Miklos Szeredi wrote:
 From: Miklos Szeredi [EMAIL PROTECTED]

 Use FS_SAFE for fuse fs type, but not for fuseblk.

 FUSE was designed from the beginning to be safe for unprivileged users.  
 This
 has also been verified in practice over many years.  In addition 
 unprivileged
 Eh? So 'kill -9 no longer works' and 'suspend no longer works' is not
 considered important enough to even mention?
 No.  Because in practice they don't seem to matter.  Also because
 there's no way in which fuse could be done differently to address
 these issues.
 Could you clarify, please? I hope I'm getting the wrong end of the stick
 - it sounds to me like you and Pavel are saying that this patch breaks
 suspending to ram (and hibernating?) but you want to push it anyway
 because you haven't been able to produce an instance, don't think
 suspending or hibernating matter and couldn't fix fuse anyway?
 
 This patch has nothing to do with suspend or hibernate.  What this
 patchset does, is help get rid of fusermount, a suid-root mount
 helper.  It also opens up new possibilities, which are not fuse
 related.

That's what I thought. So what was Pavel talking about with kill -9 no
longer works and suspend no longer works above? I couldn't understand
it from the context.

 Fuse has bad interactions with the freezer, theoretically.  In
 practice, I remember just one bug report (that sparked off this whole
 do we need freezer, or don't we flamefest), that actually got fixed
 fairly quickly, ...maybe.  Rafael probably remembers better.

I think they just gave up and considered it unsolvable. I'm not sure it is.

 The 'kill -9' thing is basically due to VFS level locking not being
 interruptible.  It could be changed, but I'm not sure it's worth it.

 For the suspend issue, there are also no easy solutions.
 What are the non-easy solutions?
 
 The ability to freeze tasks in uninterruptible sleep, or more
 generally at any preempt point (except when drivers are poking
 hardware).

Couldn't some sort of scheduler based solution deal with the
uninterruptible sleeping case?

 I know this doesn't play well with userspace hibernate, and I don't
 think it can be resolved without going the kexec way.

I can see the desirability of kexec when it comes to avoiding the
freezer, but it comes with its own problems too: having the original
context usable is handy, not having to set aside a large amount of space
for a second kernel is also desirable, and there are still greater issues
of transferring information backwards and forwards between the two kernels.

Regards,

Nigel


Re: [PATCH 0/3 -mm] kexec jump -v8

2007-12-21 Thread Nigel Cunningham
Hi.

Huang, Ying wrote:
 This patchset provides an enhancement to kexec/kdump. It implements
 the following features:
 
 - Backup/restore memory used both by the original kernel and the
   kexeced kernel.

Why the kexeced kernel as well?

[...]

 The features of this patchset can be used as follow:
 
 - Kernel/system debug through making system snapshot. You can make
   system snapshot, jump back, do some thing and make another system
   snapshot.

Are you somehow recording all the filesystem changes after the first
snapshot? If not, this is pointless (you'll end up with filesystem
corruption).

[...]

 - Cooperative multi-kernel/system. With kexec jump, you can switch
   between several kernels/systems quickly without boot process except
   the first time. This appears like swap a whole kernel/system out/in.

How is this useful to the end user?

Regards,

Nigel




Re: [RFT] Port 0x80 I/O speed

2007-12-11 Thread Nigel Cunningham
Rene Herman wrote:
 Good day.
 
 Would some people on x86 (both 32 and 64) be kind enough to compile and
 run the attached program? This is about testing how long I/O port access
 to port 0x80 takes. It measures in CPU cycles so CPU speed is crucial in
 reporting.
 
 Posted a previous incarnation of this before, buried in the outb 0x80
 thread which had a serialising problem. This one should as far as I can
 see measure the right thing though. Please yell if you disagree...
 
 For me, on a Duron 1300 (AMD756 chipset) I have a constant:
 
 [EMAIL PROTECTED]:~/src/port80$ su -c ./port80
 cycles: out 2400, in 2400
 
 and on a PII 400 (Intel 440BX chipset) a constant:
 
 [EMAIL PROTECTED]:~/src/port80$ su -c ./port80
 cycles: out 553, in 251
 
 Results are (mostly) independent of compiler optimisation, but testing
 with an -O2 compile should be most useful. Thanks!

(AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo)

[EMAIL PROTECTED]:~/Downloads$ gcc port80.c -o port80
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1235, in 1207
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1238, in 1205
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1237, in 1209
[EMAIL PROTECTED]:~/Downloads$ gcc -O2 port80.c -o port80
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1844674407370794, in 1844674407369408
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1844674407370795, in 1844674407369404
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1844674407370795, in 1844674407369409
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1844674407370798, in 1844674407369407
[EMAIL PROTECTED]:~/Downloads$ cat /proc/cpuinfo
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 36
model name  : AMD Turion(tm) 64 Mobile Technology ML-34
stepping: 2
cpu MHz : 800.000
cache size  : 1024 KB
fpu : yes
fpu_exception   : yes
cpuid level : 1
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt
lm 3dnowext 3dnow rep_good pni lahf_lm
bogomips: 1592.87
TLB size: 1024 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc


Regards,

Nigel


Re: [RFT] Port 0x80 I/O speed

2007-12-11 Thread Nigel Cunningham
Rene Herman wrote:
 On 12-12-07 00:55, Nigel Cunningham wrote:
 
 (AMD 1.8GHz Turion, running at 800MHz. ATI RS480 - Mitac 8350 mobo)

 [EMAIL PROTECTED]:~/Downloads$ gcc port80.c -o port80
 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80
 cycles: out 1235, in 1207
 
 Looking good.
 
 [EMAIL PROTECTED]:~/Downloads$ gcc -O2 port80.c -o port80
 [EMAIL PROTECTED]:~/Downloads$ sudo ./port80
 cycles: out 1844674407370794, in 1844674407369408
 
 Obviously not. I suppose this changes with -m32 on the GCC command line?
 (sorry for missing that, I have no 64-bit machines).

Yes, it does:

[EMAIL PROTECTED]:~/Downloads$ gcc -m32 -o port80 port80.c
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1231, in 1208
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 1233, in 1210

Incidentally:

[EMAIL PROTECTED]:~/Downloads$ processor_speed

(A little script I made because my lappy does a solid lock every now and
then that seems to be cpu-freq related - locking it to one frequency
makes the lock far less common).

Speed is now 180.
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2472, in 2505
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2489, in 2515
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2481, in 2503
[EMAIL PROTECTED]:~/Downloads$ sudo ./port80
cycles: out 2476, in 2507

So the same effect Maxim reported is seen here.

Regards,

Nigel


Re: Oops in evdev_disconnect for kernel 2.6.23.12

2008-01-01 Thread Nigel Cunningham
Hi Berthold.

Berthold Cogel wrote:
 Jan  1 17:34:39 wonderland kernel: usb 2-2: USB disconnect, address 3
 Jan  1 17:34:39 wonderland kernel: usb 2-2.5: USB disconnect, address 4
 Jan  1 17:34:39 wonderland kernel: drivers/input/tablet/wacom_sys.c:
 wacom_sys_irq - usb_submit_urb failed with result -19
 Jan  1 17:34:39 wonderland kernel: usb 2-2.6: USB disconnect, address 5
 Jan  1 17:34:39 wonderland kernel: BUG: unable to handle kernel paging
 request at virtual address 00100100
 Jan  1 17:34:39 wonderland kernel:  printing eip:
 Jan  1 17:34:39 wonderland kernel: f8819668
 Jan  1 17:34:39 wonderland kernel: *pde = 
 Jan  1 17:34:39 wonderland kernel: Oops:  [#1]
 Jan  1 17:34:39 wonderland kernel: PREEMPT
 Jan  1 17:34:39 wonderland kernel: Modules linked in: isofs
 nls_iso8859_1 nls_cp437 vfat fat radeon drm rfcomm l2cap bluetooth ppdev
 lp fan ac battery joydev dm_crypt wacom dm_snapshot dm_mirror sr_mod
 sd_mod sbp2 usbhid hid ff_memless usb_storage snd_emu10k1_synth
 snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1
 firmware_class snd_ac97_codec ac97_bus snd_util_mem snd_hwdep
 snd_pcm_oss snd_pcm snd_page_alloc snd_mixer_oss snd_seq_dummy
 snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq
 snd_timer snd_seq_device parport_pc parport rtc i2c_viapro ohci1394
 via_agp ide_cd agpgart snd ehci_hcd emu10k1_gp gameport 8139too
 soundcore thermal uhci_hcd ieee1394 processor button evdev
 Jan  1 17:34:39 wonderland kernel: CPU:0
 Jan  1 17:34:39 wonderland kernel: EIP:0060:[f8819668]Not
 tainted VLI
 Jan  1 17:34:39 wonderland kernel: EFLAGS: 00010206   (2.6.23.12 #1)
 Jan  1 17:34:39 wonderland kernel: EIP is at evdev_disconnect+0x65/0x9e
 [evdev]
 Jan  1 17:34:39 wonderland kernel: eax:    ebx: 000ffcf0   ecx:
 c1926760   edx: 0033
 Jan  1 17:34:39 wonderland kernel: esi: f7415600   edi: f741564c   ebp:
 f7415654   esp: c1967e68
 Jan  1 17:34:39 wonderland kernel: ds: 007b   es: 007b   fs:   gs:
   ss: 0068
 Jan  1 17:34:39 wonderland kernel: Process khubd (pid: 136, ti=c1966000
 task=c1926570 task.ti=c1966000)
 Jan  1 17:34:39 wonderland kernel: Stack: f7415800 f7402000 f7402758
 f740276c f7b94458 c03454b2  c03c6eb6
 Jan  1 17:34:39 wonderland kernel:f7bda054 c029178a f788f520
 f7bda000 f9b3c608 f9b3a3ab f7bda000 f7bda000
 Jan  1 17:34:39 wonderland kernel:f7bda01c c0337954 f7bda01c
 f9b3c638  c02fdb59 f7bda01c f7bda01c
 Jan  1 17:34:39 wonderland kernel: Call Trace:
 Jan  1 17:34:39 wonderland kernel:  [c03454b2]
 input_unregister_device+0x6f/0xff
 Jan  1 17:34:39 wonderland kernel:  [c03c6eb6] klist_release+0x27/0x30
 Jan  1 17:34:39 wonderland kernel:  [c029178a] kref_put+0x5f/0x6c
 Jan  1 17:34:39 wonderland kernel:  [f9b3a3ab]
 wacom_disconnect+0x2b/0x66 [wacom]
 Jan  1 17:34:39 wonderland kernel:  [c0337954]
 usb_unbind_interface+0x2d/0x6e
 Jan  1 17:34:39 wonderland kernel:  [c02fdb59]
 __device_release_driver+0x6e/0x8b
 Jan  1 17:34:39 wonderland kernel:  [c02fdeaf]
 device_release_driver+0x1d/0x32
 Jan  1 17:34:39 wonderland kernel:  [c02fd599]
 bus_remove_device+0x6a/0x7a
 Jan  1 17:34:39 wonderland kernel:  [c02fbde3] device_del+0x1c3/0x234
 Jan  1 17:34:39 wonderland kernel:  [c033567f]
 usb_disable_device+0x5c/0xbb
 Jan  1 17:34:39 wonderland kernel:  [c0331ff9] usb_disconnect+0x7e/0xe6
 Jan  1 17:34:39 wonderland kernel:  [c0331fea] usb_disconnect+0x6f/0xe6
 Jan  1 17:34:39 wonderland kernel:  [c03324db] hub_thread+0x31c/0xa10
 Jan  1 17:34:39 wonderland kernel:  [c0114e17] update_curr+0x102/0x12c
 Jan  1 17:34:39 wonderland kernel:  [c0114a13]
 update_stats_wait_end+0x96/0xb9
 Jan  1 17:34:39 wonderland kernel:  [c01281c7]
 autoremove_wake_function+0x0/0x33
 Jan  1 17:34:39 wonderland kernel:  [c03321bf] hub_thread+0x0/0xa10
 Jan  1 17:34:39 wonderland kernel:  [c012810e] kthread+0x36/0x5c
 Jan  1 17:34:39 wonderland kernel:  [c01280d8] kthread+0x0/0x5c
 Jan  1 17:34:39 wonderland kernel:  [c01048f7]
 kernel_thread_helper+0x7/0x10
 Jan  1 17:34:39 wonderland kernel:  ===
 Jan  1 17:34:39 wonderland kernel: Code: 5e 4c 81 eb 10 04 00 00 eb 21
 8d 83 08 04 00 00 b9 06 00 02 00 ba 1d 00 00 00 e8 6a 93 95 c7 8b 9b 10
 04 00 00 81 eb 10 04 00 00 8b 83 10 04 00 00 0f 18 00 90 8d 83 10 04
 00 00 39 f8 75 cb 8d
 Jan  1 17:34:39 wonderland kernel: EIP: [f8819668]
 evdev_disconnect+0x65/0x9e [evdev] SS:ESP 0068:c1967e68
 
 
 I'm using Debian stable/testing/unstable with homemade kernel 2.6.23.12
 (patched with tuxonice-3.0-rc3-for-2.6.23.9).
 
 I tried to get my Wacom Bamboo graphics tablet to work with Linux and the
 xorg driver from linuxwacom-0.7.9-4
 (http://linuxwacom.sourceforge.net/). After 'configure/make/make
 install' from source and configuring Xorg, I got the tablet working for
 a simple user. But each time I tried to login with X as root (I know
  Bad idea  :-)) xserver got restarted. I tried to trace the
 situation with stracing gdm. I did this via an ssh 

What's in store for 2008 for TuxOnIce?

2008-01-01 Thread Nigel Cunningham
Hi all.

With the start of a new year, I suppose it's a good time to think about
what I'd like to do with TuxOnIce this year and see what feedback I get.

First up, I'm thinking about closing the mailing lists and asking people
to use LKML instead for reporting issues and so on. I'm thinking about
this because it will help with allowing people who work on mainline to
see how stable (or otherwise!) TuxOnIce is now. It should also help when
(as often happens) bug reports aren't actually issues with the patch,
but with vanilla (ie drivers). Perhaps it will also help with whatever
effort I find time to make towards convincing Andrew that it really does
have significant advantages over [u]swsusp and kexec based hibernation.

Secondly, I'm planning on moving the website soonish. It's taken longer
than I planned because it will be sharing with another server I'm
maintaining, and it has taken longer than expected to find good hosting
for the other server (which was done first). Now that I'm happy with the
other server's state, I'm hoping to start shifting
suspend2.net/tuxonice.net soon.

For those who might be looking for hosting themselves, I'm using
slicehost. I initially tried GoDaddy, but had terrible service, problems
with draconian limits on the volume of outgoing email (1000/day by
default - useless if you're doing mailing lists) and unexpected,
unexplained delays in mail delivery through the SMTP relay they force
you to use. Slicehost, on the other hand, are terrific to deal with in
every way. If you sign up with them because of this email, please
consider putting my email (nigel at suspend2.net) as the referrer - I
then get a discount on the cost of the hosting.

Third, regarding the patch itself, I'm taking my time in working towards
the 3.0 release. We don't have any major bugs with 3.0-rc3 reported, but
I have some things I want to complete before the final release:
* see it well tested;
* get a finished initial version of the cluster support;
* finish completing support for the new resume-from-other kernels
functionality that Rafael has added in 2.6.24. (We can resume from the
same kernel at the moment, but I need to convince myself that nosave
data is properly handled).

Regards,

Nigel


Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)

2008-01-01 Thread Nigel Cunningham
Hi.

Rafael J. Wysocki wrote:
 On Tuesday, 1 of January 2008, Nigel Cunningham wrote:
 Hi all.
 
 Hi Nigel,

Gidday :)

 With the start of a new year, I suppose it's a good time to think about
 what I'd like to do with TuxOnIce this year and see what feedback I get.

 First up, I'm thinking about closing the mailing lists and asking people
 to use LKML instead for reporting issues and so on. I'm thinking about
 this because it will help with allowing people who work on mainline to
 see how stable (or otherwise!) TuxOnIce is now. It should also help when
 (as often happens) bug reports aren't actually issues with the patch,
 but with vanilla (ie drivers).
 
 I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go to
 one of the kernel-related lists, but I think linux-pm may be better for that
 due to the much lower traffic.

I guess that makes sense. I guess people can always be referred to LKML
for the issues where the appropriate person isn't on linux-pm.

 Perhaps it will also help with whatever effort I find time to make towards
 convincing Andrew that it really does have significant advantages over
 [u]swsusp and kexec based hibernation. 

 Secondly, I'm planning on moving the website soonish. It's taken longer
 than I planned because it will be sharing with another server I'm
 maintaining, and it has taken longer than expected to find good hosting
 for the other server (which was done first). Now that I'm happy with the
 other server's state, I'm hoping to start shifting
 suspend2.net/tuxonice.net soon.

 For those who might be looking for hosting themselves, I'm using
 slicehost. I initially tried GoDaddy, but had terrible service, problems
 with draconian limits on the volume of outgoing email (1000/day by
 default - useless if you're doing mailing lists) and unexpected,
 unexplained delays in mail delivery through the SMTP relay they force
 you to use. Slicehost, on the other hand, are terrific to deal with in
 every way. If you sign up with them because of this email, please
 consider putting my email (nigel at suspend2.net) as the referrer - I
 then get a discount on the cost of the hosting.

 Third, regarding the patch itself, I'm taking my time in working towards
 the 3.0 release. We don't have any major bugs with 3.0-rc3 reported, but
 I have some things I want to complete before the final release:
 * see it well tested;
 * get a finished initial version of the cluster support;
 * finish completing support for the new resume-from-other kernels
 functionality that Rafael has added in 2.6.24. (We can resume from the
 same kernel at the moment, but I need to convince myself that nosave
 data is properly handled).
 
 Have you finished the support for freezing filesystems before freezing tasks
 that we talked about some time ago?

Hmm. I've had too many things going through my little brain since then.
What I currently have is support for freezing fuse filesystems
separately. It looks like:

int freeze_processes(void)
{
	int error;

	printk("Stopping fuse filesystems.\n");
	freeze_filesystems(FS_FREEZER_FUSE);
	freezer_state = FREEZER_FILESYSTEMS_FROZEN;
	printk("Freezing user space processes ... ");
	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
	if (error)
		goto Exit;
	printk("done.\n");

	sys_sync();
	printk("Stopping normal filesystems.\n");
	freeze_filesystems(FS_FREEZER_NORMAL);
	freezer_state = FREEZER_USERSPACE_FROZEN;
	printk("Freezing remaining freezable tasks ... ");
	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
	if (error)
		goto Exit;
	printk("done.");
	freezer_state = FREEZER_FULLY_ON;
 Exit:
	BUG_ON(in_atomic());
	printk("\n");
	return error;
}

(I'm not yet worrying about ext3 on fuse or such like, but it shouldn't
be hard to extend the model to do that).

Nigel


Re: [Suspend2-devel] Reboot problem

2008-01-01 Thread Nigel Cunningham
Hi Christian.

Christian Hesse wrote:
 On Tuesday 01 January 2008, Nigel Cunningham wrote:
 Third, regarding the patch itself, I'm taking my time in working towards
 the 3.0 release. We don't have any major bugs with 3.0-rc3 reported [...].
 
 Well, I think I still have a bug, though it is possibly a mainline problem
 and it's not a showstopper. After a suspend/resume cycle the reboot does not
 work. The system hangs with "Rebooting system" (or similar). After that you
 have to hard reset the system, which is not really a problem as filesystems
 have been unmounted before. Reboot without a suspend cycle before and halt
 with and without suspend cycle work without problems.

Just to clarify, do you mean rebooting after writing an image, or
shutting down and rebooting? It could be that there's some change to the
semantics in 2.6.24 that I haven't noticed yet.

 I'm using toi 3.0-rc3 with kernel 2.6.24-rc6 and besides the problem described
 above I'm really happy with toi.
 
 Happy new year to everybody!

And to you too!

Nigel


Re: [Suspend2-devel] Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)

2008-01-01 Thread Nigel Cunningham
Hi Ted.

Theodore Tso wrote:
 On Wed, Jan 02, 2008 at 10:54:18AM +1100, Nigel Cunningham wrote:
 I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go
 to one of the kernel-related lists, but I think linux-pm may be better for
 that due to the much lower traffic.
 I guess that makes sense. I guess people can always be referred to LKML
 for the issues where the appropriate person isn't on linux-pm.
 
 Hi Nigel,
 
 I'd really recommend pushing the TuxOnIce discussions to LKML.  That
 way people can see the size of the user community and Andrew and Linus
 can see how many people are using TuxOnIce.  They can also see how
 well the TuxOnIce community helps address user problems, which is a
 big consideration when Linus decides whether or not to merge a
 particular technology. 
 
 If the goal is eventual merger of TuxOnIce, LKML is really the best
 place to have the discussions.  Examples such as Realtime, CFS, and
 others have shown that you really want to keep the discussion front
 and center.  When one developer says, "not my problem; my code is
 perfect", and the other developer is working with users who report
 problems, guess which technology generally ends up getting merged by
 Linus?

Yes. The goal is eventual merger. That's what I was thinking too. Thanks
for the input!

Nigel


Re: [Suspend2-devel] Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)

2008-01-02 Thread Nigel Cunningham
Hi.

Rafael J. Wysocki wrote:
 On Wednesday, 2 of January 2008, Theodore Tso wrote:
 On Wed, Jan 02, 2008 at 10:54:18AM +1100, Nigel Cunningham wrote:
 I would also like the TuxOnIce issues related to drivers, ACPI, etc. to go
 to one of the kernel-related lists, but I think linux-pm may be better for
 that due to the much lower traffic.
 I guess that makes sense. I guess people can always be referred to LKML
 for the issues where the appropriate person isn't on linux-pm.
 Hi Nigel,

 I'd really recommend pushing the TuxOnIce discussions to LKML.
 
 CCing linux-pm (or even linux-acpi) on problem reports would still be
 recommended, though. :-)

Right. And that may make things easier as far as TuxOnIce users go too.
I have one user who currently subscribes to suspend2-users who already
tried subscribing to LKML and said he didn't like the experience. Using
linux-pm instead would save some pain there.

Nigel


Re: freeze vs freezer

2008-01-02 Thread Nigel Cunningham
Hi.

Pavel Machek wrote:
 Hi!
 
 So how do you handle threads that are blocked on I/O or a lock  
 during the system freeze process, then?
 We wait until they can continue.
 So if I have a process blocked on an unavailable NFS mount, I can't
 suspend?
 That's correct, you can't.

 [And I know what you're going to say. ;-)]
 Why exactly does suspend/hibernation depend on TASK_INTERRUPTIBLE
 instead of a zero preempt_count()?  Really what we should do is just
 iterate over all of the actual physical devices and tell each one
 "Block new IO requests preemptably, finish pending DMA, put the
 hardware in low-power mode, and prepare for suspend/hibernate".  As
 long as each driver knows how to do those simple things we can have
 an entirely consistent kernel image for both suspend and for
 hibernation.
 
 "each driver" means this is a lot of work. But yes, that is probably
 the way to go, and a patch would be welcome.

Yes, that does work. It's what I've done in my (preliminary) support for
fuse.

Regards,

Nigel



Re: [Suspend2-users] [Suspend2-devel] Freezing filesystems (Was Re: What's in store for 2008 for TuxOnIce?)

2008-01-02 Thread Nigel Cunningham
Hi Martin.

Martin Steigerwald wrote:
 Am Mittwoch 02 Januar 2008 schrieb Nigel Cunningham:
 Hi.
 
 Hi,
 
 Rafael J. Wysocki wrote:
 On Wednesday, 2 of January 2008, Theodore Tso wrote:
 On Wed, Jan 02, 2008 at 10:54:18AM +1100, Nigel Cunningham
 wrote:
 I would also like the TuxOnIce issues related to drivers,
 ACPI, etc. to go to one of the kernel-related lists, but I
 think linux-pm may be better for that due to the much lower
 traffic.
 I guess that makes sense. I guess people can always be
 referred to LKML for the issues where the appropriate person
 isn't on linux-pm.
 Hi Nigel,
 
 I'd really recommend pushing the TuxOnIce discussions to LKML.
 CCing linux-pm (or even linux-acpi) on problem reports would
 still be recommended, though. :-)
 Right. And that may make things easier as far as TuxOnIce users go
 too. I have one user who currently subscribes to suspend2-users who
 already tried subscribing to LKML and said he didn't like the
 experience. Using linux-pm instead would save some pain there.
 
 I am a bit reluctant about LKML from some of the discussions I have
 seen there and participated in during the CFS / CK discussion. I really
 didn't like the tone. It's one thing to state one's own opinion, another
 to bash each other as if there were no tomorrow.
 
 This has been refreshingly different on tuxonice mailing lists. I am
 also a bit reluctant about the traffic. I already have some quite
 high traffic mailinglists with 3-4 mails a year, but LKML
 would top these easily I guess and I am not that sure I want to put
 that load on my mail infrastructure to follow TuxOnIce developments.
 I think this is a generic problem for testers of specific kernel
 subsystems...
 
 But then LKML is where TuxOnIce is visible to the kernel developer
 community.
 
 I would appreciate linux-pm I think maybe with a guideline to CC to
 LKML in usual cases...

Thanks for your feedback. I think that's the way to go.

 BTW: toi-3.0-rc3 is rocking along nicely on my two ThinkPads (T42 and
  T23)... I am using 2.6.23.12 with cfs-v24.1...

Great to hear!

Nigel


Re: freeze vs freezer

2008-01-03 Thread Nigel Cunningham
Hi.

Rafael J. Wysocki wrote:
 On Wednesday, 2 of January 2008, Nigel Cunningham wrote:
 Pavel Machek wrote:
 So how do you handle threads that are blocked on I/O or a lock  
 during the system freeze process, then?
 We wait until they can continue.
 So if I have a process blocked on an unavailable NFS mount, I can't
 suspend?
 That's correct, you can't.

 [And I know what you're going to say. ;-)]
 Why exactly does suspend/hibernation depend on TASK_INTERRUPTIBLE
 instead of a zero preempt_count()?  Really what we should do is just
 iterate over all of the actual physical devices and tell each one
 "Block new IO requests preemptably, finish pending DMA, put the
 hardware in low-power mode, and prepare for suspend/hibernate".  As
 long as each driver knows how to do those simple things we can have
 an entirely consistent kernel image for both suspend and for
 hibernation.
 "each driver" means this is a lot of work. But yes, that is probably
 the way to go, and a patch would be welcome.
 Yes, that does work. It's what I've done in my (preliminary) support for
 fuse.
 
 Hmm, can you please elaborate a bit?

Sorry. I wasn't very unambiguous, was I? And I'm not sure now whether
you're meaning "How does fuse support relate to freezing block devices?"
or "What's this about fuse support?". Let me therefore seek to answer
both questions:

Higher level, I know (filesystems rather than block devices), but I was
meaning the general concept of blocking new requests and completing
existing ones worked fine for the supposedly impossible fuse support.

Re fuse support, let me start by saying I know this doesn't handle all
situations, but I think it's a good enough proof-of-concept implementation.

I added some simple hooks to the code for submitting new work to fuse
threads.

#define FUSE_MIGHT_FREEZE(superblock, desc) \
do { \
	int printed = 0; \
	while (superblock->s_frozen != SB_UNFROZEN) { \
		if (!printed) { \
			printk("%d frozen in " desc ".\n", current->pid); \
			printed = 1; \
		} \
		try_to_freeze(); \
		yield(); \
	} \
} while (0)

On top of this, I made a (too simple at the moment) freeze_filesystems
function which iterates through super_blocks in reverse order, freezing
fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
currently allow for the possibility of someone mounting (say) ext3 on
fuse, but that would just be an extension of what's already done.
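
As a rough illustration of the reverse-order walk described above, here is a minimal userspace sketch. It is not the TuxOnIce code: struct mount_entry, enum fs_kind and freeze_kind_reverse() are names made up for this example, and the array stands in for the kernel's super_blocks list, on the assumption that later entries were mounted later and may sit on top of earlier ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a mounted filesystem. */
enum fs_kind { FS_KIND_NORMAL, FS_KIND_FUSE };

struct mount_entry {
	const char *name;
	enum fs_kind kind;
	int frozen;      /* 0 = running, 1 = frozen */
	int freeze_seq;  /* position in the freeze order, 0 = never frozen */
};

/*
 * Walk the table from the newest mount to the oldest and freeze every
 * entry of the requested kind that is not frozen yet.  Walking backwards
 * means a filesystem stacked on another one is frozen before the one it
 * depends on.  Returns the number of entries frozen by this call.
 */
static int freeze_kind_reverse(struct mount_entry *tbl, size_t n,
			       enum fs_kind kind, int *seq)
{
	int count = 0;

	assert(tbl != NULL && seq != NULL);
	for (size_t i = n; i-- > 0; ) {
		if (tbl[i].kind != kind || tbl[i].frozen)
			continue;
		tbl[i].frozen = 1;
		tbl[i].freeze_seq = ++*seq;
		count++;
	}
	return count;
}
```

Calling it once for the fuse kind and once for the normal kind mirrors the two freeze_filesystems() calls in the function below; with a table of ext3, loopback-on-ext3 and one fuse mount, the fuse mount is frozen first, then the loopback, then ext3.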

The end result is:

int freeze_processes(void)
{
	int error;

	printk(KERN_INFO "Stopping fuse filesystems.\n");
	freeze_filesystems(FS_FREEZER_FUSE);
	freezer_state = FREEZER_FILESYSTEMS_FROZEN;
	printk(KERN_INFO "Freezing user space processes ... ");
	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
	if (error)
		goto Exit;
	printk(KERN_INFO "done.\n");

	sys_sync();
	printk(KERN_INFO "Stopping normal filesystems.\n");
	freeze_filesystems(FS_FREEZER_NORMAL);
	freezer_state = FREEZER_USERSPACE_FROZEN;
	printk(KERN_INFO "Freezing remaining freezable tasks ... ");
	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
	if (error)
		goto Exit;
	printk(KERN_INFO "done.");
	freezer_state = FREEZER_FULLY_ON;
 Exit:
	BUG_ON(in_atomic());
	printk("\n");
	return error;
}

Sorry if that's more info than you wanted.

Nigel


Re: freeze vs freezer

2008-01-03 Thread Nigel Cunningham
Hi.

Oliver Neukum wrote:
 Am Donnerstag 03 Januar 2008 schrieb Nigel Cunningham:
 On top of this, I made a (too simple at the moment) freeze_filesystems
 function which iterates through super_blocks in reverse order, freezing
 fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
 currently allow for the possibility of someone mounting (say) ext3 on
 fuse, but that would just be an extension of what's already done.
 
 How do you deal with fuse server tasks using other fuse filesystems?

Since they're frozen in reverse order, the dependent one would be frozen
first.

 How does freeze_filesystems() look?

Removing my ugly debugging statements, it's currently:

/**
 * freeze_filesystems - lock all filesystems and force them into a
 * consistent state
 */
void freeze_filesystems(int which)
{
	struct super_block *sb;

	lockdep_off();

	/*
	 * Freeze in reverse order so filesystems dependent upon others are
	 * frozen in the right order (e.g. loopback on ext3).
	 */
	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
		if (sb->s_type->fs_flags & FS_IS_FUSE &&
		    sb->s_frozen == SB_UNFROZEN &&
		    which & FS_FREEZER_FUSE) {
			sb->s_frozen = SB_FREEZE_TRANS;
			sb->s_flags |= MS_FROZEN;
			continue;
		}

		if (!sb->s_root || !sb->s_bdev ||
		    (sb->s_frozen == SB_FREEZE_TRANS) ||
		    (sb->s_flags & MS_RDONLY) ||
		    (sb->s_flags & MS_FROZEN) ||
		    !(which & FS_FREEZER_NORMAL))
			continue;
		freeze_bdev(sb->s_bdev);
		sb->s_flags |= MS_FROZEN;
	}

	lockdep_on();
}

Nigel


Re: freeze vs freezer

2008-01-03 Thread Nigel Cunningham
Hi.

Oliver Neukum wrote:
 Am Donnerstag, 3. Januar 2008 10:52:53 schrieb Nigel Cunningham:
 Hi.

 Oliver Neukum wrote:
 Am Donnerstag 03 Januar 2008 schrieb Nigel Cunningham:
 On top of this, I made a (too simple at the moment) freeze_filesystems
 function which iterates through super_blocks in reverse order, freezing
 fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
 currently allow for the possibility of someone mounting (say) ext3 on
 fuse, but that would just be an extension of what's already done.
 How do you deal with fuse server tasks using other fuse filesystems?
 Since they're frozen in reverse order, the dependent one would be frozen
 first.
 
 Say I do:
 
 a) mount fuse on /tmp/first
 b) mount fuse on /tmp/second
 
 Then the server task for (a) does ls /tmp/second. So it will be frozen,
 right? How do you then freeze (a)? And keep in mind that the server task
 may have forked.

I guess I should first ask, is this a real life problem or a
hypothetical twisted web? I don't see why you would want to make two
filesystems interdependent - it sounds like the way to create livelock
and deadlocks in normal use, before we even begin to think about
hibernating.

Regards,

Nigel


Re: freeze vs freezer

2008-01-05 Thread Nigel Cunningham
Hi.

Pavel Machek wrote:
 On Fri 2008-01-04 21:54:06, Oliver Neukum wrote:
 Am Donnerstag, 3. Januar 2008 23:06:07 schrieb Nigel Cunningham:
 Oliver Neukum wrote:
 Am Donnerstag, 3. Januar 2008 10:52:53 schrieb Nigel Cunningham:
 Oliver Neukum wrote:
 Am Donnerstag 03 Januar 2008 schrieb Nigel Cunningham:
 On top of this, I made a (too simple at the moment) freeze_filesystems
 function which iterates through super_blocks in reverse order, freezing
 fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
 currently allow for the possibility of someone mounting (say) ext3 on
 fuse, but that would just be an extension of what's already done.
 How do you deal with fuse server tasks using other fuse filesystems?
 Since they're frozen in reverse order, the dependent one would be frozen
 first.
 Say I do:

 a) mount fuse on /tmp/first
 b) mount fuse on /tmp/second

 Then the server task for (a) does ls /tmp/second. So it will be frozen,
 right? How do you then freeze (a)? And keep in mind that the server task
 may have forked.
 I guess I should first ask, is this a real life problem or a
 hypothetical twisted web? I don't see why you would want to make two
 filesystems interdependent - it sounds like the way to create livelock
 and deadlocks in normal use, before we even begin to think about
 hibernating.
 Good questions. I personally don't use fuse, but I do care about power
 management. The problem I see is that an unprivileged user could make
 that dependency, even inadvertently.
 
 The other problem is that an unprivileged user can do it with evil intent:
 a so-called denial-of-service attack.

Only in this case it would be a denial-of-denial-of-service attack,
since it would stop you hibernating or suspending :).

This is still all hypothetical. If I could have a real life case where
this could actually happen, it would help a lot.

Nigel


Re: Oops in evdev_disconnect for kernel 2.6.23.12

2008-01-05 Thread Nigel Cunningham
Hi.

Berthold Cogel wrote:
 Al Viro schrieb:
 On Tue, Jan 01, 2008 at 08:26:05PM +0100, Berthold Cogel wrote:

 Jan  1 17:34:39 wonderland kernel: BUG: unable to handle kernel
 paging request at virtual address 00100100

 LIST_POISON1

 Jan  1 17:34:39 wonderland kernel: EIP is at evdev_disconnect+0x65/0x9e 

 and by the look of code, it's a bit before the call of something that gets
 0x20006 as one of its arguments.  Which, by the look of evdev.s, gets
 passed only to kill_fasync().  So it's POLL_HUP, so this code could be
 these days:
         spin_lock(&evdev->client_lock);
         list_for_each_entry(client, &evdev->client_list, node)
                 kill_fasync(&client->fasync, SIGIO, POLL_HUP);
         spin_unlock(&evdev->client_lock);
 in evdev_hangup()
 prior to commit 6addb1d6de1968b84852f54561cc9a09b5a9:
         list_for_each_entry(client, &evdev->client_list, node)
                 kill_fasync(&client->fasync, SIGIO, POLL_HUP);
 in evdev_disconnect()


 I'm using Debian stable/testing/unstable with homemade kernel
 2.6.23.12 (patched with tuxonice-3.0-rc3-for-2.6.23.9).

 ... and seeing that this changeset postdates 2.6.23 *and* adds locking to
 the lists we are traversing in either variant, I'd bet that the kernel you
 have does *NOT* have the changeset in question, that you have list
 corruption from race and that your oops is list_for_each_entry() trying to
 walk forward from entry that just had list_del() poisoning its ->next.

 There are only 4 changesets between 2.6.23 and this one affecting
 drivers/input and only
 8006479c9b75fb6594a7b746af3d7f1fbb68f18f and
 6addb1d6de1968b84852f54561cc9a09b5a9
 appear to be relevant.  Apply to your kernel and see if it helps...
 
 Looks as if I have to start using git ... I always feared that this day
 will come. ;-)
 
 If I'm able to reproduce the oops with my patched kernel, I will gladly
 follow your advice.
 
 Regards,
 
 Berthold

I can't do it immediately but I'll send you the patches to try later
in the day if you like.

Nigel
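
The diagnosis quoted above (a list walker stepping forward from an entry that list_del() has just poisoned) can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct node and the model_* helpers are invented for the example and merely imitate the kernel's list helpers, and the poison value is the 00100100 address from the oops report.

```c
#include <assert.h>
#include <stdint.h>

struct node {
	struct node *next, *prev;
};

/* Model poison values; the first matches the faulting address in the oops. */
#define MODEL_POISON1 ((struct node *)(uintptr_t)0x00100100)
#define MODEL_POISON2 ((struct node *)(uintptr_t)0x00200200)

static void model_list_init(struct node *head)
{
	head->next = head->prev = head;
}

static void model_list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and poison its pointers so that stale users are easy to spot. */
static void model_list_del(struct node *n)
{
	assert(n->next != MODEL_POISON1); /* catch a double delete */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = MODEL_POISON1;
	n->prev = MODEL_POISON2;
}

/*
 * A walker that still holds 'cur' after it was deleted would load
 * cur->next and then dereference the poison address.  Report that
 * condition instead of actually faulting.
 */
static int next_step_would_fault(const struct node *cur)
{
	return cur->next == MODEL_POISON1;
}
```

In the model, a cursor parked on an entry becomes unsafe the moment another path deletes that entry, which is the situation a lock around both the walk and the deletion is meant to rule out.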


Re: [RFC][PATCH -mm] Freezer: Do not allow freezing processes to clear TIF_SIGPENDING

2007-10-18 Thread Nigel Cunningham
Hi.

On Friday 19 October 2007 08:22:35 Rafael J. Wysocki wrote:
 From: Rafael J. Wysocki [EMAIL PROTECTED]
 
 Do not allow processes to clear their TIF_SIGPENDING if TIF_FREEZE is set,
 to prevent them from racing with the freezer (like mysqld does, for
 example).
 
 Signed-off-by: Rafael J. Wysocki [EMAIL PROTECTED]

Acked-by: Nigel Cunningham [EMAIL PROTECTED]

 ---
  kernel/signal.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 Index: linux-2.6.23-mm1/kernel/signal.c
 ===
 --- linux-2.6.23-mm1.orig/kernel/signal.c
 +++ linux-2.6.23-mm1/kernel/signal.c
 @@ -124,7 +124,7 @@ void recalc_sigpending_and_wake(struct t
  
  void recalc_sigpending(void)
  {
 - if (!recalc_sigpending_tsk(current))
 + if (!recalc_sigpending_tsk(current) && !freezing(current))
   clear_thread_flag(TIF_SIGPENDING);
  
  }
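
The effect of the one-line change above may be easier to see in a small userspace model. This is a sketch only: struct model_task and the model_* functions are invented for the example and are not kernel APIs, and it assumes the freezer prods a task by setting TIF_SIGPENDING without queueing a real signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_task {
	bool has_pending_signal; /* a real, unblocked signal is queued */
	bool freezing;           /* models TIF_FREEZE */
	bool sigpending_flag;    /* models TIF_SIGPENDING */
};

/* Models recalc_sigpending_tsk(): sets the flag only for a real signal. */
static bool model_recalc_tsk(struct model_task *t)
{
	assert(t != NULL);
	if (t->has_pending_signal) {
		t->sigpending_flag = true;
		return true;
	}
	return false;
}

/* Behaviour before the patch: no real signal means the flag is cleared. */
static void model_recalc_old(struct model_task *t)
{
	if (!model_recalc_tsk(t))
		t->sigpending_flag = false;
}

/* Behaviour after the patch: a freezing task keeps its flag. */
static void model_recalc_patched(struct model_task *t)
{
	if (!model_recalc_tsk(t) && !t->freezing)
		t->sigpending_flag = false;
}
```

With a task that is being frozen and has no real signal queued, the old variant clears the flag and so loses the freezer's wake-up, while the patched variant leaves it set until the task has had a chance to freeze.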
 



-- 
Nigel, Michelle, Alisdair and  Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-20 Thread Nigel Cunningham
Hi Andrew.

On Thursday 20 September 2007 20:09:41 Pavel Machek wrote:
 Seems like good enough for -mm to me.
 
   Pavel

Andrew, if I recall correctly, you said a while ago that you didn't want 
another hibernation implementation in the vanilla kernel. If you're going to 
consider merging this kexec code, will you also please consider merging 
TuxOnIce?

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-20 Thread Nigel Cunningham
Hi.

On Friday 21 September 2007 11:06:23 Andrew Morton wrote:
 On Fri, 21 Sep 2007 10:24:34 +1000 Nigel Cunningham [EMAIL PROTECTED] wrote:
 
  Hi Andrew.
  
  On Thursday 20 September 2007 20:09:41 Pavel Machek wrote:
   Seems like good enough for -mm to me.
   
 Pavel
  
  Andrew, if I recall correctly, you said a while ago that you didn't want
  another hibernation implementation in the vanilla kernel. If you're going
  to consider merging this kexec code, will you also please consider merging
  TuxOnIce?
  
 
 The theory is that kexec-based hibernation will mainly use preexisting
 kexec code and will permit us to delete the existing hibernation
 implementation.
 
 That's different from replacing it.

TuxOnIce doesn't remove the existing implementation either. It can 
transparently replace it, but you can enable/disable that at compile time.

Regards,

Nigel
-- 
Nigel Cunningham
Christian Reformed Church of Cobden
103 Curdie Street, Cobden 3266, Victoria, Australia
Ph. +61 3 5595 1185 / +61 417 100 574
Communal Worship: 11 am Sunday.


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-20 Thread Nigel Cunningham
Hi.

On Friday 21 September 2007 11:41:06 Andrew Morton wrote:
  On Friday 21 September 2007 11:06:23 Andrew Morton wrote:
   On Fri, 21 Sep 2007 10:24:34 +1000 Nigel Cunningham [EMAIL PROTECTED] wrote:
   
Hi Andrew.

On Thursday 20 September 2007 20:09:41 Pavel Machek wrote:
 Seems like good enough for -mm to me.
 
   
 Pavel

Andrew, if I recall correctly, you said a while ago that you didn't want
another hibernation implementation in the vanilla kernel. If you're going
to consider merging this kexec code, will you also please consider merging
TuxOnIce?

   
   The theory is that kexec-based hibernation will mainly use preexisting
   kexec code and will permit us to delete the existing hibernation
   implementation.
   
   That's different from replacing it.
  
  TuxOnIce doesn't remove the existing implementation either. It can 
  transparently replace it, but you can enable/disable that at compile time.
 
 Right.  So we end up with two implementations in-tree.  Whereas
 kexec-based-hibernation leads us to having zero implementations in-tree.
 
 See, it's different.

That's not true. Kexec will itself be an implementation, otherwise you'd end 
up with people screaming about no hibernation support. And it won't result in 
the complete removal of the existing hibernation code from the kernel. At the 
very least, it's going to want the kernel being hibernated to have an 
interface by which it can find out which pages need to be saved. I wouldn't 
be surprised if it also ends up with an interface in which the kernel being 
hibernated tells it what bdev/sectors in which to save the image as well 
(otherwise you're going to need a dedicated, otherwise untouched partition 
exclusively for the kexec'd kernel to use), or what network settings to use 
if it wants to try to save the image to a network storage device. On top of 
that, there are all the issues related to device reinitialisation and so on, 
and it looks like there's greatly increased pain for users wanting to 
configure this new implementation. Kexec is by no means proven to be the 
panacea for all the issues.

Regards,

Nigel
-- 
Nigel Cunningham
Pastor
Christian Reformed Church of Cobden
Victoria, Australia
+61 3 5595 1185


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-20 Thread Nigel Cunningham
Hi.

On Friday 21 September 2007 12:18:57 Huang, Ying wrote:
  That's not true. Kexec will itself be an implementation, otherwise you'd
  end up with people screaming about no hibernation support. And it won't
  result in the complete removal of the existing hibernation code from the
  kernel. At the very least, it's going to want the kernel being hibernated
  to have an interface by which it can find out which pages need to be
  saved. I wouldn't
 
 This has been done by kexec/kdump guys. There is a makedumpfile utility
 and vmcoreinfo kernel mechanism to implement this. We can just reuse the
 work of kexec/kdump.

You've already said that you are currently saving all pages. How are you going 
to avoid saving free pages if you don't get the information from the kernel 
being saved? This will require more than just code reuse.

  be surprised if it also ends up with an interface in which the kernel
  being hibernated tells it what bdev/sectors in which to save the image as
  well (otherwise you're going to need a dedicated, otherwise untouched
  partition exclusively for the kexec'd kernel to use), or what network
  settings to use if it wants to try to save the image to a network storage
  device. On top of
 
 These can be done in user space. The image writing will be done in user
 space for kexec-based hibernation.

That only complicates things more. Now you need to get the information on 
where to save the image from the kernel being saved, then transfer it to 
userspace after switching to the kexec kernel. That's more kernel code, not 
less.

  that, there are all the issues related to device reinitialisation and so on,
 
 Yes. Device reinitialisation is needed. But all in all, kexec based
 hibernation can be much simpler on the kernel side.

Sorry, but I'm yet to be convinced. I'm not unwilling, I'm just not there yet.
 
  and it looks like there's greatly increased pain for users wanting to 
  configure this new implementation. Kexec is by no means proven to be the 
  panacea for all the issues.
 
 Configuration is a problem, we will work on it.
 
 But, because it is based on kexec/kdump instead of starting from
 scratch, the duplicated part between hibernation and kexec/kdump can be
 eliminated.

Regards,

Nigel
-- 
Nigel, Michelle and Alisdair Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-20 Thread Nigel Cunningham
Hi.

On Friday 21 September 2007 12:45:57 Huang, Ying wrote:
 On Fri, 2007-09-21 at 12:25 +1000, Nigel Cunningham wrote:
  Hi.
  
  On Friday 21 September 2007 12:18:57 Huang, Ying wrote:
That's not true. Kexec will itself be an implementation, otherwise you'd
end up with people screaming about no hibernation support. And it won't
result in the complete removal of the existing hibernation code from the
kernel. At the very least, it's going to want the kernel being hibernated
to have an interface by which it can find out which pages need to be
saved. I wouldn't
   
   This has been done by kexec/kdump guys. There is a makedumpfile utility
   and vmcoreinfo kernel mechanism to implement this. We can just reuse the
   work of kexec/kdump.
  
  You've already said that you are currently saving all pages. How are you
  going to avoid saving free pages if you don't get the information from the
  kernel being saved? This will require more than just code reuse.
 
 I have not tried makedumpfile. The makedumpfile utility avoids saving free
 pages by checking the mem_map of the original kernel. I think
 there is nothing preventing it from being used for kexec-based hibernation
 image writing.
 
 This is an example of duplicated effort between kexec/kdump and original
 hibernation implementation. Both kexec/kdump and hibernation need to
 save memory image without saving the free pages. This can be done once
 instead of twice.

Ok.

be surprised if it also ends up with an interface in which the kernel
being hibernated tells it what bdev/sectors in which to save the image as
well (otherwise you're going to need a dedicated, otherwise untouched
partition exclusively for the kexec'd kernel to use), or what network
settings to use if it wants to try to save the image to a network storage
device. On top of
   
   These can be done in user space. The image writing will be done in user
   space for kexec-based hibernation.
  
  That only complicates things more. Now you need to get the information on
  where to save the image from the kernel being saved, then transfer it to
  userspace after switching to the kexec kernel. That's more kernel code,
  not less.
 
 This is fairly simple in fact. For example, you can specify the
 bdev/sectors on the kernel command line when doing the kexec load
 ("kexec -l ... --append='...'"); then the image writing system can get
 it through "cat /proc/cmdline".

Sounds doable, as long as you can cope with long command lines (which 
shouldn't be a biggie). (If you've got a swapfile or parts of a swap 
partition already in use, it can be quite fragmented).
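
To make the command-line idea concrete, here is a minimal sketch of how an image writer might pull a sector list out of the text read from /proc/cmdline. Everything specific in it is an assumption for illustration: the parameter name hib_extents, its "start-end,start-end" format, and the helper names cmdline_get() and parse_extents() do not come from any existing kernel or kexec interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sector_extent {
	unsigned long long start, end; /* inclusive sector range */
};

/*
 * Copy the value of "key=value" from a space-separated command line
 * into out.  Returns 1 if the key was found and the value fits, else 0.
 */
static int cmdline_get(const char *cmdline, const char *key,
		       char *out, size_t outlen)
{
	size_t klen = strlen(key);
	const char *p = cmdline;

	assert(outlen > 0);
	while (*p) {
		while (*p == ' ' || *p == '\n')
			p++;
		const char *end = p;
		while (*end && *end != ' ' && *end != '\n')
			end++;
		if ((size_t)(end - p) > klen &&
		    strncmp(p, key, klen) == 0 && p[klen] == '=') {
			size_t vlen = (size_t)(end - p) - klen - 1;
			if (vlen >= outlen)
				return 0; /* value too long for the buffer */
			memcpy(out, p + klen + 1, vlen);
			out[vlen] = '\0';
			return 1;
		}
		p = end;
	}
	return 0;
}

/* Parse "start-end,start-end,..." and return the number of extents read. */
static size_t parse_extents(const char *val, struct sector_extent *out,
			    size_t max)
{
	size_t n = 0;

	while (*val && n < max) {
		char *rest;
		unsigned long long s = strtoull(val, &rest, 10);
		if (rest == val || *rest != '-')
			break;
		val = rest + 1;
		unsigned long long e = strtoull(val, &rest, 10);
		if (rest == val || e < s)
			break;
		out[n].start = s;
		out[n].end = e;
		n++;
		if (*rest != ',')
			break;
		val = rest + 1;
	}
	return n;
}
```

A fragmented swapfile would turn into a long list of such ranges, which is where the concern about command-line length comes from.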

Andrew, you're seeing that it really doesn't mean the removal of all 
hibernation code from the kernel being suspended, aren't you? (And if the 
kexec'd kernel is the same binary, then there's more code again).

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-21 Thread Nigel Cunningham
Hi.

On Friday 21 September 2007 21:56:29 Rafael J. Wysocki wrote:
 [Besides, the current hibernation userland interface is used by default by
 openSUSE and it's also used by quite some Debian users, so we can't drop
 it overnight and it can't be implemented in a compatible way on top of the
 kexec-based solution.]

Could it be fudged by giving userland a null image and having (say) the first 
ioctl be one that triggers all the real work (with other ioctls being noops 
or such like, as appropriate)?

Regards,

Nigel


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-21 Thread Nigel Cunningham
Hi.

On Friday 21 September 2007 22:18:19 Rafael J. Wysocki wrote:
 On Friday, 21 September 2007 13:58, Nigel Cunningham wrote:
  Hi.
  
  On Friday 21 September 2007 21:56:29 Rafael J. Wysocki wrote:
   [Besides, the current hibernation userland interface is used by default by
   openSUSE and it's also used by quite some Debian users, so we can't drop
   it overnight and it can't be implemented in a compatible way on top of the
   kexec-based solution.]
  
  Could it be fudged by giving userland a null image and having (say) the first
  ioctl be one that triggers all the real work (with other ioctls being noops
  or such like, as appropriate)?
 
 Well, the suspend part is probably doable, but I'm afraid of the resume
 one.

'k. I've occasionally thought about trying it, but haven't ever gotten around 
to actually doing it yet. (I'd like to make TuxOnIce transparently replace 
both swsusp and uswsusp if I could).

Regards,

Nigel


Re: [linux-pm] Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-21 Thread Nigel Cunningham
Hi.

On Saturday 22 September 2007 09:19:18 Kyle Moffett wrote:
 I think that in order for this to work, there would need to be some  
 ABI whereby the resume-ing kernel can pass its entire ACPI state and  
 a bunch of other ACPI-related device details to the resume-ed kernel,  
 which I believe it does not do at the moment.  I believe that what  
 causes problems is the ACPI state data that the kernel stores is  
 *different* between identical sequential boots, especially when you  
 add/remove/replace batteries, AC, etc.

That's certainly possible. We already pass a very small amount of data between 
the boot and resuming kernels at the moment, and it's done quite simply - by 
putting the variables we want to 'transfer' in a nosave page/section. I could 
conceive of a scheme wherein this was extended for driver data. Since the 
memory needed would depend on the drivers loaded, it would probably require 
that the space be allocated when hibernating, and the locations of structures 
be stored in the image header and then drivers notified of the locations to 
use when preparing to resume, but it could work...
 
 Since we currently throw away most of that in-kernel ACPI interpreter  
 state data when we load the to-be-resumed image and replace it with  
 the state from the previous boot it looks to the ACPI code and  
 firmware like our system's hardware magically changed behind its  
 back.  The result is that the ACPI and firmware code is justifiably  
 confused (although probably it should be more idempotent to begin  
 with).  There's 2 potential solutions:
 1) Formalize and copy a *lot* of ACPI state from the resume-ing
 kernel to the resume-ed kernel.
 2) Properly call the ACPI S4 methods in the proper order

... that said, I don't think the above should be necessary in most cases. I 
believe we're already calling the ACPI S4 methods in the proper order. If I 
understood correctly, Rafael put a lot of effort into learning what that was, 
and into ensuring it does get done.
 
 Neither one is particularly easy or particularly pleasant, especially  
 given all the vendor bugs in this general area.  Theoretically we  
 should be able to do both, since one will be more reliable than the  
 other on different systems depending on what kinds of firmware bugs  
 they have.

Regards,

Nigel


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-26 Thread Nigel Cunningham
Hi.

On Thursday 27 September 2007 06:30:36 Joseph Fannin wrote:
 On Fri, Sep 21, 2007 at 11:45:12AM +0200, Pavel Machek wrote:
  Hi!
   
    Sounds doable, as long as you can cope with long command lines (which
    shouldn't be a biggie). (If you've got a swapfile or parts of a swap
    partition already in use, it can be quite fragmented).
  
   Hmm.  This is an interesting problem.  Sharing a swap file or a swap
   partition with the actual swap of user space pages does seem to be
   a limitation of this approach.
  
   Although the fact that it is simple to write to a separate file may
   be a reasonable compensation.
 
  I'm not sure how you'd write it to a separate file. Notice that kjump
  kernel may not mount journalling filesystems, not even
  read-only. (Ext3 replays journal in that case). You could pass block
  numbers from the original kernel...
 
 The ext3 thing is a bug, the case for which I don't think has been
 adequately explained to the ext[34] folks.  There should be at least a
 no_replay mount flag available, or something.  It has ramifications
 for more than just hibernation.
 
 And yeah, I'm gonna bring up the swap files thing again.  If you
 can hibernate to a swap file, you can hibernate to a dedicated
 hibernation file, and vice versa.
 
 If you can't hibernate to a swap file, then swap files are
 effectively unsupported for any system you might want to hibernate.
 <handwave> I wonder what embedded folks would think about that
 </handwave>.
 
 But, in my ignorance, I'm not sure even fixing the ext3 bug will
 guarantee you consistent metadata so that you can handle a
 swap/hibernate file.  You can do a sync(), but how do you make that
 not race against running processes without the freezer, or blkdev
 snapshots?
 
 I guess uswsusp and the-patch-previously-known-as-suspend2 handle
 this somehow, though.
 
 (It's that same ignorance that has me waiting for someone with
 established credit with kernel people to make that argument for the
 ext3 bug, so I can hang my own reasons for thinking that it's bad off
 of theirs).

I haven't looked at swsusp support, but TuxOnIce handles all storage (swap 
partitions, swap files and ordinary files) by first allocating swap (if we're 
using swap), then bmapping the storage we're going to use. After that, we can 
freeze filesystems and processes with impunity. The allocated storage is then 
viewed as just a collection of bdevs, each with an ordered chain of extents 
defining which blocks we're going to read/write - a series of tapes if you 
like. In the image header, we store dev_ts and the block chains, together 
with the configuration information. As long as the same bdevs are configured 
at boot time prior to the echo > /sys/power/resume, we're in business. 
Filesystems don't need to be mounted because we don't use filesystem code 
anyway. (LVM etc does though in so far as it's needed to make the dev_t match 
the device again).

This matches with what you said above about hibernating to swap files and 
dedicated hibernation files - TuxOnIce uses exactly the same code to do the 
i/o to both; the variation is in the code to recognise the image header and 
allocate/free/bmap storage.
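
To make the "series of tapes" picture concrete, here is a minimal sketch of one such chain. It is not TuxOnIce code; the class and method names are invented for illustration, and the block numbers stand in for whatever bmapping the storage returned.

```python
from dataclasses import dataclass, field


@dataclass
class ExtentChain:
    """One 'tape': a block device plus an ordered list of inclusive extents."""
    dev_t: tuple                                   # (major, minor) of the bdev
    extents: list = field(default_factory=list)    # [(start, end), ...] in I/O order

    def add_block(self, block):
        """Append a bmapped block, extending the last extent if contiguous."""
        if self.extents and self.extents[-1][1] + 1 == block:
            start, _ = self.extents[-1]
            self.extents[-1] = (start, block)
        else:
            self.extents.append((block, block))

    def blocks(self):
        """Yield block numbers in the order they would be read or written."""
        for start, end in self.extents:
            yield from range(start, end + 1)
```

A swap partition, a swap file and an ordinary file all reduce to the same structure once bmapped, which is why the same I/O code can serve all three. What would go in the image header in this sketch is just dev_t and the extents list for each chain.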

<not a filesystem expert> Personally, I don't think ext[34] is broken. If 
there's data being left in the journal that will need replaying, then 
mounting without replaying the journal sounds wrong. Perhaps you should 
instead be arguing that nothing should be left in the journal after a 
filesystem freeze. But, of course, current code isn't doing a filesystem 
freeze (just a process freeze) and the kexec guys want to take even that 
away. </not a filesystem expert>

In short, I agree. AFAICS, you need both the process freezer and filesystem 
freezing to make this thing fly properly.

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.


Re: [RFC][PATCH 1/2 -mm] kexec based hibernation -v3: kexec jump

2007-09-27 Thread Nigel Cunningham
Hi.

On Thursday 27 September 2007 16:33:54 Huang, Ying wrote:
 On Wed, 2007-09-26 at 16:30 -0400, Joseph Fannin wrote:
  But, in my ignorance, I'm not sure even fixing the ext3 bug will
  guarantee you consistent metadata so that you can handle a
  swap/hibernate file.  You can do a sync(), but how do you make that
  not race against running processes without the freezer, or blkdev
  snapshots?
  
  I guess uswsusp and the-patch-previously-known-as-suspend2 handle
  this somehow, though.
 
 The image-writing kernel of kexec-based hibernation runs in a controlled
 way. It is not used by normal users, so only really necessary processes
 need to be run. For example, it is possible that there is only one user
 process -- the image-writing process -- running in the image-writing kernel.
 So, no freezer or blkdev snapshot is needed.

You're thinking of the wrong kernel - we were talking about prior to switching 
to the kexec'd kernel while suspending.

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.


Fwd: [Suspend2-devel] [patch] 2.2.10.3 build fixes

2007-09-30 Thread Nigel Cunningham
Hi Rafael et al.

This looks like it will be vanilla material, maybe 2.6.23 material?

Regards,

Nigel
--  Forwarded Message  --

Subject: [Suspend2-devel] [patch] 2.2.10.3 build fixes
Date: Sunday 30 September 2007
From: Roman Dubtsov (dubtsov gmail com)

Hi,

I have recently run into a build issue with 2.6.22 and tuxonice
2.2.10.3. When building a custom kernel with make-kpkg, the process
failed with a message saying: fs.h requires linux/freezer.h, which
does not exist in exported headers. Here's a quick-n-dirty patch which
fixes this. Hope it is useful.

--- 2.6.22-toi/include/linux/Kbuild.orig  2007-09-30 01:21:30.0 +0700
+++ 2.6.22-toi/include/linux/Kbuild   2007-09-29 23:52:52.0 +0700
@@ -202,6 +202,7 @@ unifdef-y += filter.h
 unifdef-y += flat.h
 unifdef-y += futex.h
 unifdef-y += fs.h
+unifdef-y += freezer.h
 unifdef-y += gameport.h
 unifdef-y += generic_serial.h
 unifdef-y += genhd.h



Re: Fwd: [Suspend2-devel] [patch] 2.2.10.3 build fixes

2007-09-30 Thread Nigel Cunningham
Hi.

On Monday 01 October 2007 05:56:45 Rafael J. Wysocki wrote:
 Hi,
 
 On Sunday, 30 September 2007 13:44, Nigel Cunningham wrote:
  Hi Rafael et al.
  
  This looks like it will be vanilla material, maybe 2.6.23 material?
 
 Well, I wouldn't like to export freezer.h .  Why exactly would that be
 necessary?

A module that starts a freezeable kthread?

I can ask for more details, and will if you like.

Regards,

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.


Re: Fwd: [Suspend2-devel] [patch] 2.2.10.3 build fixes

2007-10-01 Thread Nigel Cunningham
Hi.

On Monday 01 October 2007 08:28:02 Rafael J. Wysocki wrote:
 On Sunday, 30 September 2007 23:43, Nigel Cunningham wrote:
  On Monday 01 October 2007 05:56:45 Rafael J. Wysocki wrote:
   On Sunday, 30 September 2007 13:44, Nigel Cunningham wrote:
    Hi Rafael et al.
    
    This looks like it will be vanilla material, maybe 2.6.23 material?
   
   Well, I wouldn't like to export freezer.h .  Why exactly would that be
   necessary?
  
  A module that starts a freezeable kthread?
  
  I can ask for more details, and will if you like.
 
 Yes, please.

Ah. My bad. I should have looked at it more carefully before forwarding; it's 
a result of my modifications for fuse support.

Sorry for the noise.

Nigel
-- 
See http://www.tuxonice.net for Howtos, FAQs, mailing
lists, wiki and bugzilla info.

