Re: speeding up swapoff
Daniel Drake <[EMAIL PROTECTED]> writes: > > It's more-or-less a real life problem. We have an interactive > application which, when triggered by the user, performs rendering tasks > which must operate in real-time. In attempt to secure performance, we > want to ensure everything is memory resident and that nothing might be > swapped out during the process. So, we run swapoff at that time. If the system gets under serious memory pressure it'll happily discard your text pages too (and later reload them from disk). The same for any file data you might need to access. swapoff will only affect anonymous memory, but not all the other memory you'll need as well. There's no way around mlock/mlockall() to really prevent this. Still even with that you could still lose dentries/inodes etc which can also cause stalls. The only way to keep them locked is to keep the files always open. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
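For readers who want to try the mlockall() route Andi mentions from a scripting language, the following is a minimal sketch using ctypes. It is not taken from the thread: the helper name `lock_all_memory` is hypothetical, and the flag values are an assumption that holds on common Linux architectures such as x86 (a few architectures define them differently in `<sys/mman.h>`).

```python
import ctypes
import ctypes.util

# Assumption: values from <sys/mman.h> on common Linux architectures (e.g. x86).
MCL_CURRENT = 1  # lock every page that is mapped right now
MCL_FUTURE = 2   # also lock pages that get mapped later


def lock_all_memory(flags=MCL_CURRENT | MCL_FUTURE):
    """Call mlockall(flags) via libc; return (ok, errno_value)."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mlockall.argtypes = [ctypes.c_int]
    if libc.mlockall(flags) == 0:
        return True, 0
    # Typical failures: EPERM/ENOMEM when the locked-memory limit is too low.
    return False, ctypes.get_errno()
```

A caller would invoke `lock_all_memory()` once at startup and inspect the returned errno on failure. As Andi notes, even a successful call does not pin dentries or inodes, so file lookups can still stall.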
Re: speeding up swapoff
Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
>>> My experiments show that when there is not much free physical memory,
>>> swapoff moves pages out of swap at a rate of approximately 5mb/sec.
>>
>> sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.
>
>> before you go there... is this a "real life" problem? Or just a
>> mostly-artificial corner case? (the answer to that obviously is
>> relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In an attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.

So the real issue isn't that your process doesn't run fast enough without doing swapoff, but that swapoff itself takes too long.

> If there is a decent number of pages swapped out, the user sits for a
> while at a 'please wait' screen, which is not desirable. To throw some
> numbers out there, likely more than a minute for 400mb of swapped pages.
>
> Sure, we could run the whole interactive application with swap disabled,
> which is pretty much what we do. However we have other non-real-time
> processing tasks which are very memory hungry and do require swap. So,
> there are 'corner cases' where the user can reach the real-time part of
> the interactive application when there is a lot of memory swapped out.

How much is "a lot?" You said 400MB, you can add a few GB of RAM and eliminate the problem at that size. Run the application in a virtual machine which has enough dedicated memory? I think xen will do that. Run "swap" on a ramdisk? I don't think swapoff was designed as a fast operation, although your performance is pretty leisurely. ;-)

I assume you looked at mlock() and it doesn't fit your usage, or you don't control the application behavior, or its limitations make it unsuitable in some other way.

>> Another question, if this is during system shutdown, maybe that's a
>> valid case for flushing most of the pagecache first (from userspace)
>> since most of what's there won't be used again anyway. If that's enough
>> to make this go faster...
>
> Shutdown isn't a concern here.
>
>> A third question, have you investigated what happens if a process gets
>> killed that has pages in swap; as long as we don't page those in but
>> just forget about them, that would solve the shutdown problem nicely
>> (since we kill stuff first anyway there)
>
> According to top, those pages in swap disappear when the process is
> killed. So, I don't think there are any swap-related performance issues
> on the shutdown path.
>
> Thanks.

-- 
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
  the machinations of the wicked." - from Slashdot
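Bill's question of how much is "a lot" can be answered on a running system before swapoff is attempted. The sketch below parses text in the format of /proc/meminfo and reports swap in use as SwapTotal minus SwapFree. The function name and the sample figures are hypothetical, chosen so that the result matches the 400mb case discussed in the thread.

```python
def swap_used_kb(meminfo_text):
    """Return SwapTotal - SwapFree (in kB) from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # first token is the number
    return fields["SwapTotal"] - fields["SwapFree"]


# Hypothetical sample, not measured on any real machine.
SAMPLE = """MemTotal:  2048000 kB
SwapTotal: 1048572 kB
SwapFree:   638972 kB
"""

print(swap_used_kb(SAMPLE) // 1024, "MB of swap in use")
```

On a live Linux system the same function could be fed the contents of `open("/proc/meminfo").read()`.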
Re: speeding up swapoff
On Thu, 2007-08-30 at 11:36 +0100, Hugh Dickins wrote:
> Regarding Daniel's use of swapoff: it's a very heavy sledgehammer
> for cracking that nut, I strongly agree with those who have pointed
> him to mlock and mlockall instead.

There are some issues with us using mlockall. Admittedly, most/all of them are not the kernel's problem (but a fast swapoff would be a good workaround):

We're using python 2.4, so mlock() itself isn't really an option (we don't realistically have access to the address regions hidden behind the language).

mlockall() is a possibility, but the fact that all allocations above a particular limit will fail would potentially cause us problems given that it's hard to control python's memory usage for a long-running application. Additionally, choosing that limit is hard given that we have this real-time and non-real-time processing balance, plus an interactive python-based application that runs all the time (which is the thing we would be locking).

python 2.4 never returns memory to the OS, so at whatever point the memory usage of the application peaks, all that memory remains locked permanently.

In addition we have the non-real-time processing task which does benefit from having more memory available, so in that case, we would want it to swap out parts of the application. I guess we could ask the application to do munlockall() here, but things start getting scary and overcomplicated at this point...

So, our arguments against mlockall() are not strong, but you can see why fast swapoff would be mighty convenient.

Thanks for all the info so far. It does sound like my earlier idea wouldn't be any faster in the general case due to excess disk seeking. Oh well...

-- 
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource
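The "particular limit" Daniel mentions is assumed here to be the per-process locked-memory limit (RLIMIT_MEMLOCK), which can be inspected from Python's standard `resource` module. This is only a sketch for checking the limit before deciding whether mlockall() is workable; the helper name is hypothetical.

```python
import resource


def memlock_limit_bytes():
    """Return (soft, hard) RLIMIT_MEMLOCK in bytes; None means unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def as_bytes(value):
        return None if value == resource.RLIM_INFINITY else value

    return as_bytes(soft), as_bytes(hard)


soft, hard = memlock_limit_bytes()
print("soft limit:", soft, "hard limit:", hard)
```

If the soft limit is small compared with the application's peak memory use, the allocation failures Daniel worries about become likely once the process is locked with MCL_FUTURE.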
Re: speeding up swapoff
On Thu, 2007-08-30 at 16:06 +0200, Helge Hafting wrote:
> Xavier Bestel wrote:
> > On Thu, 2007-08-30 at 15:55 +0200, Helge Hafting wrote:
> >
> >> If the swap device is full, then there is no need for random
> >> seeks as the swap pages can be read in disk order.
> >
> > If the swap file is full, you probably have a machine dead in a swap
> > storm.
> Only if you have enough swap. :-)

Yeah, sure. But these days disk space is cheap and I tend to create swap partitions that are too big, and I always regret it later ...

	Xav
Re: speeding up swapoff
Xavier Bestel wrote:
> On Thu, 2007-08-30 at 15:55 +0200, Helge Hafting wrote:
>> If the swap device is full, then there is no need for random
>> seeks as the swap pages can be read in disk order.
>
> If the swap file is full, you probably have a machine dead in a swap
> storm.

Only if you have enough swap. :-)

Helge Hafting
Re: speeding up swapoff
On Thu, 2007-08-30 at 15:55 +0200, Helge Hafting wrote:
> If the swap device is full, then there is no need for random
> seeks as the swap pages can be read in disk order.

If the swap file is full, you probably have a machine dead in a swap storm.
Re: speeding up swapoff
Robert Hancock wrote:
> Daniel Drake wrote:
>> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
>>>> My experiments show that when there is not much free physical memory,
>>>> swapoff moves pages out of swap at a rate of approximately 5mb/sec.
>>>
>>> sounds like about disk speed (at random-seek IO pattern)
>>
>> We are only using 'standard' seagate SATA disks, but I would have
>> thought much more performance (40+ mb/sec) would be reachable.
>
> Not if it is doing random seeks..

If the swap device is full, then there is no need for random seeks as the swap pages can be read in disk order.

With a swap device that is not so full, the read will skip over the unused areas; the time needed should still be bounded by the time needed to read the whole swap device.

Whether this optimization is worth it is another question, though.

Helge Hafting
Re: speeding up swapoff
On Thu, 30 Aug 2007, Eric W. Biederman wrote:
>
> There is one other possibility. Typically the swap code is using
> compatibility disk I/O functions instead of the best the kernel
> can offer. I haven't looked recently but it might be worth just
> making certain that there isn't some low-level optimization or
> cleanup possible on that path. Although I may just be thinking
> of swapfiles.

Andrew rewrote swapfile support in 2.5, making it use FIBMAP at swapon time: so that in 2.6 swapfiles are as deadlock-free and as efficient (unless the swapfile happens to be badly fragmented) as raw disk partitions.

There's certainly scope for a study of I/O patterns in swapping, it's hard to imagine that improvements couldn't be made (but also easy to imagine endless disputes over different kinds of workload). But most people would appreciate an improvement in active swapping, and not care very much about the swapoff.

Regarding Daniel's use of swapoff: it's a very heavy sledgehammer for cracking that nut, I strongly agree with those who have pointed him to mlock and mlockall instead.

Hugh
Re: speeding up swapoff
Hugh Dickins <[EMAIL PROTECTED]> writes:

> The speedups I've imagined making, were a need demonstrated, have
> been more on the lines of batching (dealing with a range of pages
> in one go) and hashing (using the swapmap's ushort, so often 1 or
> 2 or 3, to hold an indicator of where to look for its references).

There is one other possibility. Typically the swap code is using compatibility disk I/O functions instead of the best the kernel can offer. I haven't looked recently but it might be worth just making certain that there isn't some low-level optimization or cleanup possible on that path. Although I may just be thinking of swapfiles.

I know there were tremendous gains some time ago when I removed the functions that wrote pages synchronously to swapfiles.

Eric
Re: speeding up swapoff
Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
>>> My experiments show that when there is not much free physical memory,
>>> swapoff moves pages out of swap at a rate of approximately 5mb/sec.
>>
>> sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.

Not if it is doing random seeks..

>> before you go there... is this a "real life" problem? Or just a
>> mostly-artificial corner case? (the answer to that obviously is
>> relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In an attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.
>
> If there is a decent number of pages swapped out, the user sits for a
> while at a 'please wait' screen, which is not desirable. To throw some
> numbers out there, likely more than a minute for 400mb of swapped pages.
>
> Sure, we could run the whole interactive application with swap disabled,
> which is pretty much what we do. However we have other non-real-time
> processing tasks which are very memory hungry and do require swap. So,
> there are 'corner cases' where the user can reach the real-time part of
> the interactive application when there is a lot of memory swapped out.

Normally mlockall is what is used in this sort of situation, that way it doesn't force all swapped data in for every app. It's possible that calling this with lots of swapped pages in the app at the time may have the same problem though.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: speeding up swapoff
On Wednesday 29 August 2007, Hugh Dickins wrote:
> On Wed, 29 Aug 2007, Oliver Neukum wrote:
> > On Wednesday 29 August 2007, Arjan van de Ven wrote:
> > > Another question, if this is during system shutdown, maybe that's a
> > > valid case for flushing most of the pagecache first (from userspace)
> > > since most of what's there won't be used again anyway. If that's enough
> > > to make this go faster...
> >
> > Is there a good reason to swapoff during shutdown?
>
> Three reasons, I think, only one of them compelling:
>
> 1. Tidiness.
> 2. So swapoff gets testing and I get to hear of any bugs in it.
> 3. If a regular swapfile is used instead of a disk partition, you
>    need to swapoff before its filesystem can be unmounted cleanly.

Yes. I hadn't thought of that. I am using a dedicated disk.

	Regards
		Oliver
Re: speeding up swapoff
On Wednesday 29 August 2007 16:44, Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > My experiments show that when there is not much free physical memory,
> > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> >
> > sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.
>
> > before you go there... is this a "real life" problem? Or just a
> > mostly-artificial corner case? (the answer to that obviously is
> > relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In an attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.

Did you play with mlock()?

Juergen
Re: speeding up swapoff
On Wed, 2007-08-29 at 09:29 -0400, Daniel Drake wrote:
> Hi,
>
> I've spent some time trying to understand why swapoff is such a slow
> operation.
>
> My experiments show that when there is not much free physical memory,
> swapoff moves pages out of swap at a rate of approximately 5mb/sec. When
> there is a lot of free physical memory, it is faster but still a slow
> CPU-intensive operation, purging swap at about 20mb/sec.
>
> I've read into the swap code and I have some understanding that this is
> an expensive operation (and has to be). This page was very helpful and
> also agrees:
> http://kernel.org/doc/gorman/html/understand/understand014.html
>
> After reading that, I have an idea for a possible optimization. If we
> were to create a system call to disable ALL swap partitions (or modify
> the existing one to accept NULL for that purpose), could this process be
> significantly less complex?
>
> I'm thinking we could do something like this:
> 1. Prevent any more pages from being swapped out from this point
> 2. Iterate through all process page tables, paging all swapped
>    pages back into physical memory and updating PTEs
> 3. Clear all swap tables and caches
>
> Due to only iterating through process page tables once, does this sound
> like it would increase performance non-trivially? Is it feasible?
>
> I'm happy to spend a few more hours looking into implementing this but
> would greatly appreciate any advice from those in-the-know on if my
> ideas are broken to start with...

Daniel: in a response, Juergen Beisert asked if you'd tried mlock() [mlockall() would probably be a better choice] to lock your application into memory. That would require modifying the application. Don't know if you want to do that.

Back in Feb'07, I posted an RFC regarding [optionally] inheriting mlockall() semantics across fork and exec. The original posting is here:

http://marc.info/?l=linux-mm&m=117217855508612&w=4

The patch is quite stale now [against 20-rc], but shouldn't be too much work to rebase to something more recent. The patch description points to an ad hoc mlock "prefix command" that would allow you to:

mlock

and run the application as if it had called "mlockall(MCL_CURRENT|MCL_FUTURE)", without having to modify the application--if that's something you can't or don't want to do.

Maybe this would help?

Lee
Re: speeding up swapoff
On Wed, 29 Aug 2007, Oliver Neukum wrote:
> On Wednesday 29 August 2007, Arjan van de Ven wrote:
> > Another question, if this is during system shutdown, maybe that's a
> > valid case for flushing most of the pagecache first (from userspace)
> > since most of what's there won't be used again anyway. If that's enough
> > to make this go faster...
>
> Is there a good reason to swapoff during shutdown?

Three reasons, I think, only one of them compelling:

1. Tidiness.
2. So swapoff gets testing and I get to hear of any bugs in it.
3. If a regular swapfile is used instead of a disk partition, you
   need to swapoff before its filesystem can be unmounted cleanly.

Hugh
Re: speeding up swapoff
On Wed, 29 Aug 2007, Arjan van de Ven wrote:
> On Wed, 29 Aug 2007 09:29:32 -0400
> Daniel Drake <[EMAIL PROTECTED]> wrote:
>
> > I've spent some time trying to understand why swapoff is such a slow
> > operation.
> >
> > My experiments show that when there is not much free physical memory,
> > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
>
> sounds like about disk speed (at random-seek IO pattern)

The present method should be reading sequentially (with gaps), rather than randomly. Perhaps we need to check what's happening in practice.

(I've often dithered over whether we should be doing swap readahead there or not: at present it does not, preferring to assume buffering at the hardware level, and last time I checked that worked out a little better.)

> Another question, if this is during system shutdown, maybe that's a
> valid case for flushing most of the pagecache first (from userspace)
> since most of what's there won't be used again anyway. If that's enough
> to make this go faster...

(I didn't understand your point there, but Daniel has replied that it's not at shutdown anyway.)

> A third question, have you investigated what happens if a process gets
> killed that has pages in swap; as long as we don't page those in but
> just forget about them, that would solve the shutdown problem nicely
> (since we kill stuff first anyway there)

We definitely don't page those in, it would be a disaster for process exit if we did: they just get discarded.

As you say, shutdown is rarely a big issue, because almost all the processes which had stuff in swap have already been killed. tmpfs use of swap can be an issue there, but if the distro is wise, it'll do things in such an order that tmpfs'es are unmounted before swapoff (but may need two passes: the opposite case is a regular swapfile, where we need to swapoff before that partition can be unmounted).

Hugh
Re: speeding up swapoff
On Wed, 29 Aug 2007, Daniel Drake wrote:
>
> I've spent some time trying to understand why swapoff is such a slow
> operation.
>
> My experiments show that when there is not much free physical memory,
> swapoff moves pages out of swap at a rate of approximately 5mb/sec. When
> there is a lot of free physical memory, it is faster but still a slow
> CPU-intensive operation, purging swap at about 20mb/sec.

Yes, it can be shamefully slow. But we've done nothing about it for years, simply because very few actually suffer from its worst cases. You're the first I've heard complain about it in a long time: perhaps you'll be joined by a chorus, and we can have fun looking at it again.

> I've read into the swap code and I have some understanding that this is
> an expensive operation (and has to be). This page was very helpful and
> also agrees:
> http://kernel.org/doc/gorman/html/understand/understand014.html
>
> After reading that, I have an idea for a possible optimization. If we
> were to create a system call to disable ALL swap partitions (or modify
> the existing one to accept NULL for that purpose), could this process be
> significantly less complex?

I'd be quite strongly against an additional system call: if we're going to speed it up, let's speed up the common case, not your special additional call.

But I don't think you need that anyway: the slowness doesn't come from the limited number of swap areas, but from the much greater numbers of processes and their pages. Looping over the number of swap areas (so often 1) isn't a problem.

> I'm thinking we could do something like this:
> 1. Prevent any more pages from being swapped out from this point
> 2. Iterate through all process page tables, paging all swapped
>    pages back into physical memory and updating PTEs
> 3. Clear all swap tables and caches
>
> Due to only iterating through process page tables once, does this sound
> like it would increase performance non-trivially? Is it feasible?

I'll ignore your steps 1 and 3, I don't see the advantage. (We do already prevent pages from being swapped out to the area we're swapping off, and in general we need to allow for swapping out to another area while swapping off.)

Step 2 is the core of your idea. Feasible yes, and very much less CPU-intensive than the present method. But... it would be reading in pages from swap in pretty much a random order, whereas the present method is reading them in sequentially, to minimize disk seek time. So I doubt your way would actually work out faster, except in those (exceptional, I'm afraid) cases where almost all the swap pages are already in core swapcache when swapoff begins.

> I'm happy to spend a few more hours looking into implementing this but
> would greatly appreciate any advice from those in-the-know on if my
> ideas are broken to start with...

Well, do give it a try if you're interested: I've never actually timed doing it that way, and might be surprised. I doubt you could actually remove the present code, but it could become a fallback to clear up the loose ends after some faster first pass.

Don't forget you'll also need to deal with tmpfs files (mm/shmem.c): Christoph Rohland long ago had a patch to work on those in the way you propose, but we never integrated it because of the random seek issue.

The speedups I've imagined making, were a need demonstrated, have been more on the lines of batching (dealing with a range of pages in one go) and hashing (using the swapmap's ushort, so often 1 or 2 or 3, to hold an indicator of where to look for its references).

Hugh
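Hugh's objection about read order can be illustrated with a toy model. The sketch below is not kernel code: it treats swap slots as integer offsets on a disk, assumes the order in which a page-table walk encounters swapped pages is unrelated to their position on disk, and compares the total head movement of the two read orders. All names and sizes are hypothetical.

```python
import random


def total_seek_distance(offsets):
    """Sum of head movements when slots are read in the given order."""
    position, distance = 0, 0
    for off in offsets:
        distance += abs(off - position)
        position = off
    return distance


def compare_orders(n_slots=10000, n_used=2000, seed=0):
    """Return (disk-order distance, page-table-order distance)."""
    rng = random.Random(seed)
    # Assumed: page-table walk order is effectively random w.r.t. the disk.
    walk_order = rng.sample(range(n_slots), n_used)
    return total_seek_distance(sorted(walk_order)), total_seek_distance(walk_order)


sequential, scattered = compare_orders()
print("disk order:", sequential, "page-table order:", scattered)
```

In this model the disk-order pass never travels further than the size of the swap area, while the page-table-order pass crosses it many times over. Real disks, readahead, and hardware buffering would change the absolute numbers, which is why Hugh suggests timing it.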
Re: speeding up swapoff
On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > My experiments show that when there is not much free physical memory,
> > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
>
> sounds like about disk speed (at random-seek IO pattern)

We are only using 'standard' seagate SATA disks, but I would have thought much more performance (40+ mb/sec) would be reachable.

> before you go there... is this a "real life" problem? Or just a
> mostly-artificial corner case? (the answer to that obviously is
> relevant for the 'should we really care' question)

It's more-or-less a real life problem. We have an interactive application which, when triggered by the user, performs rendering tasks which must operate in real-time. In an attempt to secure performance, we want to ensure everything is memory resident and that nothing might be swapped out during the process. So, we run swapoff at that time.

If there is a decent number of pages swapped out, the user sits for a while at a 'please wait' screen, which is not desirable. To throw some numbers out there, likely more than a minute for 400mb of swapped pages.

Sure, we could run the whole interactive application with swap disabled, which is pretty much what we do. However we have other non-real-time processing tasks which are very memory hungry and do require swap. So, there are 'corner cases' where the user can reach the real-time part of the interactive application when there is a lot of memory swapped out.

> Another question, if this is during system shutdown, maybe that's a
> valid case for flushing most of the pagecache first (from userspace)
> since most of what's there won't be used again anyway. If that's enough
> to make this go faster...

Shutdown isn't a concern here.

> A third question, have you investigated what happens if a process gets
> killed that has pages in swap; as long as we don't page those in but
> just forget about them, that would solve the shutdown problem nicely
> (since we kill stuff first anyway there)

According to top, those pages in swap disappear when the process is killed. So, I don't think there are any swap-related performance issues on the shutdown path.

Thanks.

-- 
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource
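The figures quoted in this message are easy to check. The short calculation below takes the 400mb of swapped pages and the three throughput figures mentioned in the thread (5, 20 and 40 mb/sec) and prints the implied wait; the function name is hypothetical.

```python
def swapoff_seconds(swapped_mb, rate_mb_per_s):
    """Time to drain swap at a constant rate, in seconds."""
    return swapped_mb / float(rate_mb_per_s)


for rate in (5, 20, 40):
    print("%2d mb/sec -> %5.1f s" % (rate, swapoff_seconds(400, rate)))
```

At 5 mb/sec the wait is 80 seconds, which matches "more than a minute"; at the hoped-for 40 mb/sec it would drop to 10 seconds.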
Re: speeding up swapoff
On Wednesday 29 August 2007, Arjan van de Ven wrote:
> Another question, if this is during system shutdown, maybe that's a
> valid case for flushing most of the pagecache first (from userspace)
> since most of what's there won't be used again anyway. If that's enough
> to make this go faster...

Is there a good reason to swapoff during shutdown?

	Regards
		Oliver
Re: speeding up swapoff
On Wed, 29 Aug 2007 09:29:32 -0400 Daniel Drake <[EMAIL PROTECTED]> wrote:

Hi,

> I've spent some time trying to understand why swapoff is such a slow
> operation.
>
> My experiments show that when there is not much free physical memory,
> swapoff moves pages out of swap at a rate of approximately 5mb/sec.

sounds like about disk speed (at random-seek IO pattern)

> I'm happy to spend a few more hours looking into implementing this but
> would greatly appreciate any advice from those in-the-know on if my
> ideas are broken to start with...

before you go there... is this a "real life" problem? Or just a mostly-artificial corner case? (the answer to that obviously is relevant for the 'should we really care' question)

Another question, if this is during system shutdown, maybe that's a valid case for flushing most of the pagecache first (from userspace) since most of what's there won't be used again anyway. If that's enough to make this go faster...

A third question, have you investigated what happens if a process gets killed that has pages in swap; as long as we don't page those in but just forget about them, that would solve the shutdown problem nicely (since we kill stuff first anyway there)
speeding up swapoff
Hi,

I've spent some time trying to understand why swapoff is such a slow operation.

My experiments show that when there is not much free physical memory, swapoff moves pages out of swap at a rate of approximately 5mb/sec. When there is a lot of free physical memory, it is faster but still a slow CPU-intensive operation, purging swap at about 20mb/sec.

I've read into the swap code and I have some understanding that this is an expensive operation (and has to be). This page was very helpful and also agrees:
http://kernel.org/doc/gorman/html/understand/understand014.html

After reading that, I have an idea for a possible optimization. If we were to create a system call to disable ALL swap partitions (or modify the existing one to accept NULL for that purpose), could this process be significantly less complex?

I'm thinking we could do something like this:
 1. Prevent any more pages from being swapped out from this point
 2. Iterate through all process page tables, paging all swapped
    pages back into physical memory and updating PTEs
 3. Clear all swap tables and caches

Due to only iterating through process page tables once, does this sound like it would increase performance non-trivially? Is it feasible?

I'm happy to spend a few more hours looking into implementing this but would greatly appreciate any advice from those in-the-know on if my ideas are broken to start with...

Thanks!

-- 
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource
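The CPU-cost argument behind step 2 of the proposal can be sketched with a toy count of page-table entry visits. This is a deliberately simplified model, not the kernel's algorithm: page tables are plain lists, each swap slot is assumed to be referenced by exactly one entry, and the "present method" is modelled as a fresh search of the page tables for every slot in use. All names are hypothetical.

```python
def present_method_visits(page_tables, swap_slots):
    """Count entries visited when each slot triggers its own search."""
    visits = 0
    for slot in swap_slots:
        found = False
        for table in page_tables:
            for entry in table:
                visits += 1
                if entry == slot:
                    found = True
                    break
            if found:
                break
    return visits


def proposed_method_visits(page_tables):
    """Count entries visited by a single pass over all page tables."""
    return sum(len(table) for table in page_tables)


# Two tiny "processes"; None marks an entry that is not swapped out.
tables = [[None, 0, None, 1], [2, None, None, 3]]
print(present_method_visits(tables, [0, 1, 2, 3]), "vs", proposed_method_visits(tables))
```

The per-slot search grows with (slots in use) x (entries scanned), while the single pass grows only with the number of entries. The model says nothing about disk seek order, which is the other half of the trade-off.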