Re: speeding up swapoff

2007-09-01 Thread Andi Kleen
Daniel Drake <[EMAIL PROTECTED]> writes:
> 
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.

If the system gets under serious memory pressure it'll happily discard
your text pages too (and later reload them from disk). The same
for any file data you might need to access.

swapoff will only affect anonymous memory, but not all the other
memory you'll need as well.

There's no way around mlock/mlockall() to really prevent this.

Even with that you could still lose dentries/inodes etc., which
can also cause stalls. The only way to keep them locked
is to keep the files always open.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
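
A minimal sketch of the mlockall() route Andi describes, written in Python because the application discussed in this thread is Python-based. It is an illustration under assumptions, not the posters' code: the function names are hypothetical, the MCL_* values are the Linux x86/ARM ones (some architectures use different numbers), and ctypes only became part of the standard library in Python 2.5, so it would not run unmodified on the Python 2.4 mentioned elsewhere in the thread. For an unprivileged process the call typically fails with ENOMEM or EPERM when RLIMIT_MEMLOCK is too small.

```python
import ctypes

# Assumed Linux values from <sys/mman.h> on x86 and ARM; other
# architectures may define different numbers.
MCL_CURRENT = 1
MCL_FUTURE = 2

# dlopen(NULL): resolves mlockall/munlockall from the already loaded libc.
_libc = ctypes.CDLL(None, use_errno=True)


def lock_all_memory(flags=MCL_CURRENT | MCL_FUTURE):
    """Call mlockall(2); return 0 on success, otherwise the errno."""
    if _libc.mlockall(flags) == 0:
        return 0
    return ctypes.get_errno()


def unlock_all_memory():
    """Call munlockall(2); return 0 on success, otherwise the errno."""
    if _libc.munlockall() == 0:
        return 0
    return ctypes.get_errno()


def hold_open(paths):
    """Keep files open so their dentries/inodes stay referenced."""
    return [open(path, "rb") for path in paths]
```

A caller would invoke lock_all_memory() before the real-time phase and unlock_all_memory() afterwards, keeping the list returned by hold_open() alive for as long as the files must stay pinned.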


Re: speeding up swapoff

2007-08-30 Thread Bill Davidsen

Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > My experiments show that when there is not much free physical memory,
> > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> >
> > sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.
>
> > before you go there... is this a "real life" problem? Or just a
> > mostly-artificial corner case? (the answer to that obviously is
> > relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.

So the real issue isn't that your process doesn't run fast enough
without doing swapoff, but that swapoff itself takes too long.

> If there is a decent number of pages swapped out, the user sits for a
> while at a 'please wait' screen, which is not desirable. To throw some
> numbers out there, likely more than a minute for 400mb of swapped pages.
>
> Sure, we could run the whole interactive application with swap disabled,
> which is pretty much what we do. However we have other non-real-time
> processing tasks which are very memory hungry and do require swap. So,
> there are 'corner cases' where the user can reach the real-time part of
> the interactive application when there is a lot of memory swapped out.

How much is "a lot?" You said 400MB, you can add a few GB of RAM and
eliminate the problem at that size. Run the application in a virtual
machine which has enough dedicated memory? I think xen will do that. Run
"swap" on a ramdisk? I don't think swapoff was designed as a fast
operation, although your performance is pretty leisurely. ;-)

I assume you looked at mlock() and it doesn't fit your usage, or you
don't control the application behavior, or its limitations make it
unsuitable in some other way.

> > Another question, if this is during system shutdown, maybe that's a
> > valid case for flushing most of the pagecache first (from userspace)
> > since most of what's there won't be used again anyway. If that's enough
> > to make this go faster...
>
> Shutdown isn't a concern here.
>
> > A third question, have you investigated what happens if a process gets
> > killed that has pages in swap; as long as we don't page those in but
> > just forget about them, that would solve the shutdown problem nicely
> > (since we kill stuff first anyway there)
>
> According to top, those pages in swap disappear when the process is
> killed. So, I don't think there are any swap-related performance issues
> on the shutdown path.
>
> Thanks.



--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
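
The figures quoted in this thread can be checked with a little arithmetic. The sketch below is only an illustration (the function name is hypothetical); it uses the 400 MB of swapped pages and the 5, 20 and 40 MB/s rates that Daniel mentions, and assumes the rate stays constant for the whole swapoff.

```python
def swapoff_seconds(swapped_mb, rate_mb_per_s):
    """Time to drain swap at a constant rate (a simplifying assumption)."""
    return swapped_mb / float(rate_mb_per_s)


# 5 MB/s: measured with little free memory; 20 MB/s: measured with plenty
# of free memory; 40 MB/s: the sequential rate Daniel hoped for.
for rate in (5, 20, 40):
    print("%d MB/s -> %.0f s" % (rate, swapoff_seconds(400, rate)))
# 5 MB/s -> 80 s
# 20 MB/s -> 20 s
# 40 MB/s -> 10 s
```

The 80 seconds at 5 MB/s agrees with the "more than a minute for 400mb" reported above.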


Re: speeding up swapoff

2007-08-30 Thread Daniel Drake
On Thu, 2007-08-30 at 11:36 +0100, Hugh Dickins wrote:
> Regarding Daniel's use of swapoff: it's a very heavy sledgehammer
> for cracking that nut, I strongly agree with those who have pointed
> him to mlock and mlockall instead.

There are some issues with us using mlockall. Admittedly, most/all of
them are not the kernel's problem (but a fast swapoff would be a good
workaround):

We're using python 2.4, so mlock() itself isn't really an option (we
don't realistically have access to the address regions hidden behind the
language). mlockall() is a possibility, but the fact that all
allocations above a particular limit will fail would potentially cause
us problems given that it's hard to control python's memory usage for a
long-running application.

Additionally, choosing that limit is hard given that we have this
real-time and non-real-time processing balance, plus an interactive
python-based application that runs all the time (which is the thing we
would be locking). python 2.4 never returns memory to the OS, so at
whatever point the memory usage of the application peaks, all that
memory remains locked permanently.

In addition we have the non-real-time processing task which does benefit
from having more memory available, so in that case, we would want it to
swap out parts of the application. I guess we could ask the application
to do munlockall() here, but things start getting scary and
overcomplicated at this point...

So, our arguments against mlockall() are not strong, but you can see why
fast swapoff would be mighty convenient.

Thanks for all the info so far. It does sound like my earlier idea
wouldn't be any faster in the general case due to excess disk seeking.
Oh well...

-- 
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource



Re: speeding up swapoff

2007-08-30 Thread Xavier Bestel
On Thu, 2007-08-30 at 16:06 +0200, Helge Hafting wrote:
> Xavier Bestel wrote:
> > On Thu, 2007-08-30 at 15:55 +0200, Helge Hafting wrote:
> >   
> >> If the swap device is full, then there is no need for random
> >> seeks as the swap pages can be read in disk order.
> >> 
> >
> > If the swap file is full, you probably have a machine dead into a swap
> > storm.
> Only if you have enough swap. :-)

Yeah, sure. But these days disk space is cheap and I tend to put too big
swap partitions, and I always regret it later ...

Xav




Re: speeding up swapoff

2007-08-30 Thread Helge Hafting

Xavier Bestel wrote:
> On Thu, 2007-08-30 at 15:55 +0200, Helge Hafting wrote:
> > If the swap device is full, then there is no need for random
> > seeks as the swap pages can be read in disk order.
>
> If the swap file is full, you probably have a machine dead into a swap
> storm.

Only if you have enough swap. :-)

Helge Hafting


Re: speeding up swapoff

2007-08-30 Thread Xavier Bestel
On Thu, 2007-08-30 at 15:55 +0200, Helge Hafting wrote:
> If the swap device is full, then there is no need for random
> seeks as the swap pages can be read in disk order.

If the swap file is full, you probably have a machine dead into a swap
storm.




Re: speeding up swapoff

2007-08-30 Thread Helge Hafting

Robert Hancock wrote:
> Daniel Drake wrote:
> > On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > > My experiments show that when there is not much free physical memory,
> > > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> > >
> > > sounds like about disk speed (at random-seek IO pattern)
> >
> > We are only using 'standard' seagate SATA disks, but I would have
> > thought much more performance (40+ mb/sec) would be reachable.
>
> Not if it is doing random seeks..

If the swap device is full, then there is no need for random
seeks as the swap pages can be read in disk order. A not
so full swap will skip over the unused areas, the time
needed should still be limited to the time needed for reading the
whole swap device.

If this optimization is worth it is another problem though.

Helge Hafting
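
Helge's bound can be put into numbers. The sketch below is a rough model, not a measurement: it takes the 5 MB/s figure as the random-seek rate and the 40 MB/s figure as the sequential rate (both from this thread), while the function names and the device sizes in the example are hypothetical. Under those assumptions a full in-order scan of the device wins whenever more than 5/40 = 12.5% of it is in use.

```python
def random_read_seconds(swapped_mb, random_rate=5.0):
    """Time to fetch only the used pages with a random-seek pattern."""
    return swapped_mb / random_rate


def full_scan_seconds(device_mb, sequential_rate=40.0):
    """Upper bound: read the whole swap device in disk order."""
    return device_mb / sequential_rate


def scan_is_faster(device_mb, swapped_mb,
                   random_rate=5.0, sequential_rate=40.0):
    """True when the in-order scan beats random reads of the used pages."""
    return (full_scan_seconds(device_mb, sequential_rate)
            < random_read_seconds(swapped_mb, random_rate))


# Hypothetical devices holding the 400 MB of swapped pages from the thread.
print(scan_is_faster(1024, 400))   # 25.6 s scan vs 80 s random: True
print(scan_is_faster(4096, 400))   # 102.4 s scan vs 80 s random: False
```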


Re: speeding up swapoff

2007-08-30 Thread Hugh Dickins
On Thu, 30 Aug 2007, Eric W. Biederman wrote:
> 
> There is one other possibility.  Typically the swap code is using
> compatibility disk I/O functions instead of the best the kernel
> can offer.  I haven't looked recently but it might be worth just
> making certain that there isn't some low-level optimization or
> cleanup possible on that path.  Although I may just be thinking
> of swapfiles.

Andrew rewrote swapfile support in 2.5, making it use FIBMAP at
swapon time: so that in 2.6 swapfiles are as deadlock-free and
as efficient (unless the swapfile happens to be badly fragmented)
as raw disk partitions.

There's certainly scope for a study of I/O patterns in swapping,
it's hard to imagine that improvements couldn't be made (but also
easy to imagine endless disputes over different kinds of workload).
But most people would appreciate an improvement in active swapping,
and not care very much about the swapoff.

Regarding Daniel's use of swapoff: it's a very heavy sledgehammer
for cracking that nut, I strongly agree with those who have pointed
him to mlock and mlockall instead.

Hugh


Re: speeding up swapoff

2007-08-30 Thread Eric W. Biederman
Hugh Dickins <[EMAIL PROTECTED]> writes:

> The speedups I've imagined making, were a need demonstrated, have
> been more on the lines of batching (dealing with a range of pages
> in one go) and hashing (using the swapmap's ushort, so often 1 or
> 2 or 3, to hold an indicator of where to look for its references).

There is one other possibility.  Typically the swap code is using
compatibility disk I/O functions instead of the best the kernel
can offer.  I haven't looked recently but it might be worth just
making certain that there isn't some low-level optimization or
cleanup possible on that path.  Although I may just be thinking
of swapfiles.

I know there were tremendous gains ago when I removed the functions
that wrote pages synchronously to swapfiles.

Eric


Re: speeding up swapoff

2007-08-29 Thread Robert Hancock

Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > My experiments show that when there is not much free physical memory,
> > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> >
> > sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.

Not if it is doing random seeks..

> > before you go there... is this a "real life" problem? Or just a
> > mostly-artificial corner case? (the answer to that obviously is
> > relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.
>
> If there is a decent number of pages swapped out, the user sits for a
> while at a 'please wait' screen, which is not desirable. To throw some
> numbers out there, likely more than a minute for 400mb of swapped pages.
>
> Sure, we could run the whole interactive application with swap disabled,
> which is pretty much what we do. However we have other non-real-time
> processing tasks which are very memory hungry and do require swap. So,
> there are 'corner cases' where the user can reach the real-time part of
> the interactive application when there is a lot of memory swapped out.

Normally mlockall is what is used in this sort of situation, that way it
doesn't force all swapped data in for every app. It's possible that
calling this with lots of swapped pages in the app at the time may have
the same problem though.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: speeding up swapoff

2007-08-29 Thread Oliver Neukum
Am Mittwoch 29 August 2007 schrieb Hugh Dickins:
> On Wed, 29 Aug 2007, Oliver Neukum wrote:
> > Am Mittwoch 29 August 2007 schrieb Arjan van de Ven:
> > > Another question, if this is during system shutdown, maybe that's a
> > > valid case for flushing most of the pagecache first (from userspace)
> > > since most of what's there won't be used again anyway. If that's enough
> > > to make this go faster...
> > 
> > Is there a good reason to swapoff during shutdown?
> 
> Three reasons, I think, only one of them compelling:
> 
> 1. Tidiness.
> 2. So swapoff gets testing and I get to hear of any bugs in it.
> 3. If a regular swapfile is used instead of a disk partition, you
>    need to swapoff before its filesystem can be unmounted cleanly.

Yes. I hadn't thought of that. I am using a dedicated disk.

Regards
Oliver



Re: speeding up swapoff

2007-08-29 Thread Juergen Beisert
On Wednesday 29 August 2007 16:44, Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > My experiments show that when there is not much free physical memory,
> > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> >
> > sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.
>
> > before you go there... is this a "real life" problem? Or just a
> > mostly-artificial corner case? (the answer to that obviously is
> > relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.

Did you play with mlock()?

Juergen


Re: speeding up swapoff

2007-08-29 Thread Lee Schermerhorn
On Wed, 2007-08-29 at 09:29 -0400, Daniel Drake wrote:
> Hi,
> 
> I've spent some time trying to understand why swapoff is such a slow
> operation.
> 
> My experiments show that when there is not much free physical memory,
> swapoff moves pages out of swap at a rate of approximately 5mb/sec. When
> there is a lot of free physical memory, it is faster but still a slow
> CPU-intensive operation, purging swap at about 20mb/sec.
> 
> I've read into the swap code and I have some understanding that this is
> an expensive operation (and has to be). This page was very helpful and
> also agrees:
> http://kernel.org/doc/gorman/html/understand/understand014.html
> 
> After reading that, I have an idea for a possible optimization. If we
> were to create a system call to disable ALL swap partitions (or modify
> the existing one to accept NULL for that purpose), could this process be
> significantly less complex?
> 
> I'm thinking we could do something like this:
>  1. Prevent any more pages from being swapped out from this point
>  2. Iterate through all process page tables, paging all swapped
> pages back into physical memory and updating PTEs
>  3. Clear all swap tables and caches
> 
> Due to only iterating through process page tables once, does this sound
> like it would increase performance non-trivially? Is it feasible?
> 
> I'm happy to spend a few more hours looking into implementing this but
> would greatly appreciate any advice from those in-the-know on if my
> ideas are broken to start with...

Daniel:  

in a response, Juergen Beisert asked if you'd tried mlock()  [mlockall()
would probably be a better choice] to lock your application into memory.
That would require modifying the application.  Don't know if you want to
do that.

Back in Feb'07, I posted an RFC regarding [optionally] inheriting
mlockall() semantics across fork and exec.  The original posting is
here:

http://marc.info/?l=linux-mm&m=117217855508612&w=4

The patch is quite stale now [against 20-rc], but shouldn't
be too much work to rebase to something more recent.  The patch
description points to an ad hoc mlock "prefix command" that would allow
you to:

mlock 

and run the application as if it had called "mlockall(MCL_CURRENT|
MCL_FUTURE)", without having to modify the application--if that's
something you can't or don't want to do.

Maybe this would help?

Lee
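
The three steps in Daniel's quoted proposal can be pictured with a toy model. This is purely illustrative and is not kernel code: page tables and the swap area are plain dictionaries, every name is hypothetical, and the model ignores disk-seek cost entirely, which Daniel concedes elsewhere in the thread is what would stop the idea from being faster in the general case.

```python
def swapoff_all_single_pass(page_tables, swap_slots):
    """Toy model of the three proposed steps.

    page_tables: {pid: {vaddr: ("ram", data) or ("swap", slot)}}
    swap_slots:  {slot: data}
    Returns the number of page-table entries visited.
    """
    # Step 1 (no further swap-out) is implicit: nothing in this model
    # ever adds entries to swap_slots.
    visited = 0
    # Step 2: walk every page table exactly once, pulling swapped pages in.
    for ptes in page_tables.values():
        for vaddr, (kind, value) in list(ptes.items()):
            visited += 1
            if kind == "swap":
                ptes[vaddr] = ("ram", swap_slots[value])
    # Step 3: no references remain, so the swap tables can be cleared.
    swap_slots.clear()
    return visited
```

Slots are only cleared at the end, so a slot referenced from two page tables is still available when the second reference is reached.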



Re: speeding up swapoff

2007-08-29 Thread Hugh Dickins
On Wed, 29 Aug 2007, Oliver Neukum wrote:
> Am Mittwoch 29 August 2007 schrieb Arjan van de Ven:
> > Another question, if this is during system shutdown, maybe that's a
> > valid case for flushing most of the pagecache first (from userspace)
> > since most of what's there won't be used again anyway. If that's enough
> > to make this go faster...
> 
> Is there a good reason to swapoff during shutdown?

Three reasons, I think, only one of them compelling:

1. Tidiness.
2. So swapoff gets testing and I get to hear of any bugs in it.
3. If a regular swapfile is used instead of a disk partition, you
   need to swapoff before its filesystem can be unmounted cleanly.

Hugh


Re: speeding up swapoff

2007-08-29 Thread Hugh Dickins
On Wed, 29 Aug 2007, Arjan van de Ven wrote:
> On Wed, 29 Aug 2007 09:29:32 -0400
> Daniel Drake <[EMAIL PROTECTED]> wrote:
> 
> > I've spent some time trying to understand why swapoff is such a slow
> > operation.
> > 
> > My experiments show that when there is not much free physical memory,
> > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> 
> sounds like about disk speed (at random-seek IO pattern)

The present method should be reading sequentially (with gaps),
rather than randomly.  Perhaps we need to check what's happening
in practice.

(I've often dithered over whether we should be doing swap readahead
there or not: at present it does not, preferring to assume buffering
at the hardware level, and last time I checked that worked out a
little better.)

> Another question, if this is during system shutdown, maybe that's a
> valid case for flushing most of the pagecache first (from userspace)
> since most of what's there won't be used again anyway. If that's enough
> to make this go faster...

(I didn't understand your point there, but Daniel has replied that
it's not at shutdown anyway.)

> A third question, have you investigated what happens if a process gets
> killed that has pages in swap; as long as we don't page those in but
> just forget about them, that would solve the shutdown problem nicely
> (since we kill stuff first anyway there)

We definitely don't page those in, it would be a disaster for process
exit if we did: they just get discarded.

As you say, shutdown is rarely a big issue, because almost all the
processes which had stuff in swap have already been killed.  tmpfs
use of swap can be an issue there, but if the distro is wise, it'll
do things in such an order that tmpfs'es are unmounted before swapoff
(but may need two passes: the opposite case is a regular swapfile,
where we need to swapoff before that partition can be unmounted).
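
The ordering described above can be sketched as a small planner. This is
only an illustration: plan_shutdown() and its input format are hypothetical,
the mount points are made up, and a real init script has more cases to
handle. It unmounts tmpfs first, then runs swapoff, then unmounts everything
else (which covers a filesystem that holds a swap file).

```python
def plan_shutdown(mounts, swap_areas):
    """Return an ordered list of ("umount" | "swapoff", target) steps.

    mounts     -- list of (mountpoint, fstype) in the order they were mounted
    swap_areas -- list of swap devices or swap files currently in use
    """
    steps = []
    # Pass 1: unmount tmpfs before swapoff, so swapoff does not have to
    # deal with tmpfs data that is sitting in swap.
    for mountpoint, fstype in reversed(mounts):
        if fstype == "tmpfs":
            steps.append(("umount", mountpoint))
    # Swap goes away next; a swap file must be released before its
    # filesystem can be unmounted cleanly.
    for area in swap_areas:
        steps.append(("swapoff", area))
    # Pass 2: everything else, most recently mounted first.
    for mountpoint, fstype in reversed(mounts):
        if fstype != "tmpfs":
            steps.append(("umount", mountpoint))
    return steps


if __name__ == "__main__":
    example = plan_shutdown(
        [("/", "ext3"), ("/home", "ext3"), ("/dev/shm", "tmpfs")],
        ["/home/swapfile"],
    )
    for action, target in example:
        print(action, target)
```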

Hugh


Re: speeding up swapoff

2007-08-29 Thread Hugh Dickins
On Wed, 29 Aug 2007, Daniel Drake wrote:
> 
> I've spent some time trying to understand why swapoff is such a slow
> operation.
> 
> My experiments show that when there is not much free physical memory,
> swapoff moves pages out of swap at a rate of approximately 5mb/sec. When
> there is a lot of free physical memory, it is faster but still a slow
> CPU-intensive operation, purging swap at about 20mb/sec.

Yes, it can be shamefully slow.  But we've done nothing about it for
years, simply because very few actually suffer from its worst cases.
You're the first I've heard complain about it in a long time: perhaps
you'll be joined by a chorus, and we can have fun looking at it again.

> 
> I've read into the swap code and I have some understanding that this is
> an expensive operation (and has to be). This page was very helpful and
> also agrees:
> http://kernel.org/doc/gorman/html/understand/understand014.html
> 
> After reading that, I have an idea for a possible optimization. If we
> were to create a system call to disable ALL swap partitions (or modify
> the existing one to accept NULL for that purpose), could this process be
> significantly less complex?

I'd be quite strongly against an additional system call: if we're
going to speed it up, let's speed up the common case, not your special
additional call.  But I don't think you need that anyway: the slowness
doesn't come from the limited number of swap areas, but from the much
greater numbers of processes and their pages.  Looping over the number
of swap areas (so often 1) isn't a problem.

> 
> I'm thinking we could do something like this:
>  1. Prevent any more pages from being swapped out from this point
>  2. Iterate through all process page tables, paging all swapped
> pages back into physical memory and updating PTEs
>  3. Clear all swap tables and caches
> 
> Due to only iterating through process page tables once, does this sound
> like it would increase performance non-trivially? Is it feasible?

I'll ignore your steps 1 and 3, I don't see the advantage.  (We
do already prevent pages from being swapped out to the area we're
swapping off, and in general we need to allow for swapping out to
another area while swapping off.)  Step 2 is the core of your idea.

Feasible yes, and very much less CPU-intensive than the present method.
But... it would be reading in pages from swap in pretty much a random
order, whereas the present method is reading them in sequentially, to
minimize disk seek time.  So I doubt your way would actually work out
faster, except in those (exceptional, I'm afraid) cases where almost
all the swap pages are already in core swapcache when swapoff begins.
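
The trade-off can be made concrete with a toy model. This is a simulation
under stated assumptions, not kernel code: every name is hypothetical, each
swap slot is referenced by exactly one page-table entry, slots are numbered
without gaps, and "seek cost" is simply the distance between consecutive
slot numbers. One routine visits slots in ascending order and searches the
page tables for each one; the other walks every page table once and reads
slots in whatever order it meets them.

```python
import random


def build_toy_system(n_procs=4, pages_per_proc=200, n_swapped=300, seed=0):
    """Return one page table per process; an entry is a swap slot or None."""
    rng = random.Random(seed)
    tables = [[None] * pages_per_proc for _ in range(n_procs)]
    positions = [(p, v) for p in range(n_procs) for v in range(pages_per_proc)]
    swapped = rng.sample(positions, n_swapped)
    slots = list(range(n_swapped))
    rng.shuffle(slots)  # slot numbers are unrelated to virtual address order
    for (proc, vpage), slot in zip(swapped, slots):
        tables[proc][vpage] = slot
    return tables


def seek_distance(read_order):
    """Sum of the distances between consecutively read swap slots."""
    return sum(abs(b - a) for a, b in zip(read_order, read_order[1:]))


def slot_order_scan(tables):
    """Slot-by-slot model: sequential reads, one page-table search per slot."""
    n_slots = sum(entry is not None for table in tables for entry in table)
    reads, ptes_examined = [], 0
    for slot in range(n_slots):
        found = False
        for table in tables:
            for entry in table:
                ptes_examined += 1
                if entry == slot:
                    found = True
                    break
            if found:
                break
        reads.append(slot)
    return reads, ptes_examined


def page_table_walk(tables):
    """Single-walk model: each entry examined once, reads in walk order."""
    reads, ptes_examined = [], 0
    for table in tables:
        for entry in table:
            ptes_examined += 1
            if entry is not None:
                reads.append(entry)
    return reads, ptes_examined


if __name__ == "__main__":
    toy = build_toy_system()
    for name, method in (("slot order", slot_order_scan),
                         ("table walk", page_table_walk)):
        order, cpu = method(toy)
        print(name, "entries examined:", cpu, "seek distance:", seek_distance(order))
```

In this model the single walk examines far fewer entries but its seek
distance is much larger, which is the shape of the argument above. How the
two effects balance on a real disk is exactly what would need timing.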

> 
> I'm happy to spend a few more hours looking into implementing this but
> would greatly appreciate any advice from those in-the-know on if my
> ideas are broken to start with...

Well, do give it a try if you're interested: I've never actually
timed doing it that way, and might be surprised.  I doubt you could
actually remove the present code, but it could become a fallback to
clear up the loose ends after some faster first pass.

Don't forget you'll also need to deal with tmpfs files (mm/shmem.c):
Christoph Rohland long ago had a patch to work on those in the way you
propose, but we never integrated it because of the random seek issue.

The speedups I've imagined making, were a need demonstrated, have
been more on the lines of batching (dealing with a range of pages
in one go) and hashing (using the swapmap's ushort, so often 1 or
2 or 3, to hold an indicator of where to look for its references).

Hugh


Re: speeding up swapoff

2007-08-29 Thread Daniel Drake
On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > My experiments show that when there is not much free physical memory,
> > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> 
> sounds like about disk speed (at random-seek IO pattern)

We are only using 'standard' Seagate SATA disks, but I would have
thought much more performance (40+ mb/sec) would be reachable.

> before you go there... is this a "real life" problem? Or just a
> mostly-artificial corner case? (the answer to that obviously is
> relevant for the 'should we really care' question)

It's more-or-less a real life problem. We have an interactive
application which, when triggered by the user, performs rendering tasks
which must operate in real-time. In an attempt to secure performance, we
want to ensure everything is memory resident and that nothing might be
swapped out during the process. So, we run swapoff at that time.

If there is a decent number of pages swapped out, the user sits for a
while at a 'please wait' screen, which is not desirable. To throw some
numbers out there, likely more than a minute for 400mb of swapped pages.
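
As a quick check of those numbers, here is the arithmetic for 400 MB of
swapped pages at the three rates mentioned in this thread (5 and 20 MB/sec
measured, 40 MB/sec hoped for). The function name is just for illustration,
and it assumes a constant rate for the whole operation.

```python
def swapoff_seconds(swapped_mb, rate_mb_per_s):
    """Time to drain swap at a constant rate, in seconds."""
    return swapped_mb / rate_mb_per_s


for rate in (5, 20, 40):
    print(f"{rate:>2} MB/s -> {swapoff_seconds(400, rate):.0f} s")
# prints:
#  5 MB/s -> 80 s
# 20 MB/s -> 20 s
# 40 MB/s -> 10 s
```

At the 5 MB/sec rate seen under memory pressure, 400 MB takes 80 seconds,
which matches the "more than a minute" wait.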

Sure, we could run the whole interactive application with swap disabled,
which is pretty much what we do. However we have other non-real-time
processing tasks which are very memory hungry and do require swap. So,
there are 'corner cases' where the user can reach the real-time part of
the interactive application when there is a lot of memory swapped out.

> Another question, if this is during system shutdown, maybe that's a
> valid case for flushing most of the pagecache first (from userspace)
> since most of what's there won't be used again anyway. If that's enough
> to make this go faster...

Shutdown isn't a concern here.

> A third question, have you investigated what happens if a process gets
> killed that has pages in swap; as long as we don't page those in but
> just forget about them, that would solve the shutdown problem nicely
> (since we kill stuff first anyway there)

According to top, those pages in swap disappear when the process is
killed. So, I don't think there are any swap-related performance issues
on the shutdown path.

Thanks.
-- 
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource



Re: speeding up swapoff

2007-08-29 Thread Oliver Neukum
On Wednesday, 29 August 2007, Arjan van de Ven wrote:
> Another question, if this is during system shutdown, maybe that's a
> valid case for flushing most of the pagecache first (from userspace)
> since most of what's there won't be used again anyway. If that's enough
> to make this go faster...

Is there a good reason to swapoff during shutdown?

Regards
Oliver



Re: speeding up swapoff

2007-08-29 Thread Arjan van de Ven
On Wed, 29 Aug 2007 09:29:32 -0400
Daniel Drake <[EMAIL PROTECTED]> wrote:


Hi,

> I've spent some time trying to understand why swapoff is such a slow
> operation.
> 
> My experiments show that when there is not much free physical memory,
> swapoff moves pages out of swap at a rate of approximately 5mb/sec.

sounds like about disk speed (at random-seek IO pattern)


> I'm happy to spend a few more hours looking into implementing this but
> would greatly appreciate any advice from those in-the-know on if my
> ideas are broken to start with...

before you go there... is this a "real life" problem? Or just a
mostly-artificial corner case? (the answer to that obviously is
relevant for the 'should we really care' question)

Another question, if this is during system shutdown, maybe that's a
valid case for flushing most of the pagecache first (from userspace)
since most of what's there won't be used again anyway. If that's enough
to make this go faster...

A third question, have you investigated what happens if a process gets
killed that has pages in swap; as long as we don't page those in but
just forget about them, that would solve the shutdown problem nicely
(since we kill stuff first anyway there)



speeding up swapoff

2007-08-29 Thread Daniel Drake
Hi,

I've spent some time trying to understand why swapoff is such a slow
operation.

My experiments show that when there is not much free physical memory,
swapoff moves pages out of swap at a rate of approximately 5mb/sec. When
there is a lot of free physical memory, it is faster but still a slow
CPU-intensive operation, purging swap at about 20mb/sec.

I've read into the swap code and I have some understanding that this is
an expensive operation (and has to be). This page was very helpful and
also agrees:
http://kernel.org/doc/gorman/html/understand/understand014.html

After reading that, I have an idea for a possible optimization. If we
were to create a system call to disable ALL swap partitions (or modify
the existing one to accept NULL for that purpose), could this process be
significantly less complex?

I'm thinking we could do something like this:
 1. Prevent any more pages from being swapped out from this point
 2. Iterate through all process page tables, paging all swapped
pages back into physical memory and updating PTEs
 3. Clear all swap tables and caches

Due to only iterating through process page tables once, does this sound
like it would increase performance non-trivially? Is it feasible?

I'm happy to spend a few more hours looking into implementing this but
would greatly appreciate any advice from those in-the-know on if my
ideas are broken to start with...

Thanks!
-- 
Daniel Drake
Brontes Technologies, A 3M Company
http://www.brontes3d.com/opensource



Re: speeding up swapoff

2007-08-29 Thread Juergen Beisert
On Wednesday 29 August 2007 16:44, Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > My experiments show that when there is not much free physical memory,
> > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> >
> > sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' Seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.
>
> > before you go there... is this a "real life" problem? Or just a
> > mostly-artificial corner case? (the answer to that obviously is
> > relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In an attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.

Did you play with mlock()?

Juergen


Re: speeding up swapoff

2007-08-29 Thread Oliver Neukum
On Wednesday, 29 August 2007, Hugh Dickins wrote:
> On Wed, 29 Aug 2007, Oliver Neukum wrote:
> > On Wednesday, 29 August 2007, Arjan van de Ven wrote:
> > > Another question, if this is during system shutdown, maybe that's a
> > > valid case for flushing most of the pagecache first (from userspace)
> > > since most of what's there won't be used again anyway. If that's enough
> > > to make this go faster...
> >
> > Is there a good reason to swapoff during shutdown?
>
> Three reasons, I think, only one of them compelling:
>
> 1. Tidiness.
> 2. So swapoff gets testing and I get to hear of any bugs in it.
> 3. If a regular swapfile is used instead of a disk partition, you
>    need to swapoff before its filesystem can be unmounted cleanly.

Yes. I hadn't thought of that. I am using a dedicated disk.

Regards
Oliver



Re: speeding up swapoff

2007-08-29 Thread Robert Hancock

Daniel Drake wrote:
> On Wed, 2007-08-29 at 07:30 -0700, Arjan van de Ven wrote:
> > > My experiments show that when there is not much free physical memory,
> > > swapoff moves pages out of swap at a rate of approximately 5mb/sec.
> >
> > sounds like about disk speed (at random-seek IO pattern)
>
> We are only using 'standard' Seagate SATA disks, but I would have
> thought much more performance (40+ mb/sec) would be reachable.

Not if it is doing random seeks.

> > before you go there... is this a "real life" problem? Or just a
> > mostly-artificial corner case? (the answer to that obviously is
> > relevant for the 'should we really care' question)
>
> It's more-or-less a real life problem. We have an interactive
> application which, when triggered by the user, performs rendering tasks
> which must operate in real-time. In an attempt to secure performance, we
> want to ensure everything is memory resident and that nothing might be
> swapped out during the process. So, we run swapoff at that time.
>
> If there is a decent number of pages swapped out, the user sits for a
> while at a 'please wait' screen, which is not desirable. To throw some
> numbers out there, likely more than a minute for 400mb of swapped pages.
>
> Sure, we could run the whole interactive application with swap disabled,
> which is pretty much what we do. However we have other non-real-time
> processing tasks which are very memory hungry and do require swap. So,
> there are 'corner cases' where the user can reach the real-time part of
> the interactive application when there is a lot of memory swapped out.

Normally mlockall() is what is used in this sort of situation; that way it
doesn't force all swapped data in for every app. It's possible that
calling it while the app has lots of swapped-out pages may have the same
problem, though.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
