Re: 2.4 and 2GB swap partition limit

2001-05-03 Thread Andreas Dilger

Andi Kleen writes:
> On Mon, Apr 30, 2001 at 12:07:44PM -0700, David S. Miller wrote:
> But something must have been not working with it for mmaps/shlibs 
> (not executables); at least historically.
> At least I remember that all hell broke loose when you tried to update
> libc by cp'ing a new one to /lib/libc.so in the 1.2 days. cp should create a 
> new inode (it uses O_CREAT) so in theory it should be coherent by the inode 
> reference; but somehow it didn't use to work and random already running 
> programs started to segfault. This was long ago. I wonder if old
> GNU cp used O_TRUNC instead of O_CREAT, or was there some other kernel bug 
> with mappings (hopefully long since fixed). Does anybody remember? 

No, "cp" is still not a safe way to update libc, and I doubt it ever will be.
cp does _not_ create a new inode (unlike mv), so you are writing "garbage"
into the currently running executables.

NB: my open(2) says for O_CREAT "If the file does not exist it will be
created".  This is _not_ the same as O_CREAT combined with O_EXCL, which
makes open(2) fail if the file already exists instead of reusing its inode.
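
The difference between overwriting in place and the mv-style install can be sketched in C. This is only a minimal illustration under stated assumptions: install_atomically() is a hypothetical helper, not something from this thread or from any installer, and the 0644 mode is an assumed value. It writes the new contents to a fresh inode in the same directory and then uses rename(2), so processes that already have the old file mapped keep seeing the old inode.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write `len` bytes to a new
 * file next to `path`, then rename(2) it over `path`.  Processes that
 * already mapped the old file keep the old inode and its contents.
 * Returns 0 on success, -1 on failure.
 */
int install_atomically(const char *path, const void *data, size_t len)
{
    char tmp[1024];
    const char *p = data;
    int fd, n;

    /* Same directory, so the rename stays within one filesystem. */
    n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    fd = mkstemp(tmp);              /* always creates a fresh inode */
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w < 0)
            goto fail;
        p += w;
        len -= (size_t)w;
    }

    /* mkstemp() creates the file 0600; 0644 is an assumed final mode. */
    if (fchmod(fd, 0644) != 0 || fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }
    if (rename(tmp, path) != 0) {   /* atomic switch to the new inode */
        unlink(tmp);
        return -1;
    }
    return 0;

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

By contrast, a plain cp onto an existing file opens the same inode and rewrites it, which is why running programs see the new bytes.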

Cheers, Andreas
-- 
Andreas Dilger   Turbolinux filesystem development
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-05-03 Thread Ishikawa

J. A. Magallon wrote:
>I'm not so sure it's only a 'rule of thumb'. Do not know the state of
>paging in just released 2.4.4, but in previous kernels, a page that was
>paged-out, reserves its place in swap even if it is paged-in again, so
>once you have paged-out all your ram at least once, you can't get any
>more memory, even if swap is 'empty'.

I am not sure whether I am seeing the kernel behavior mentioned above, but
it seems that with kernel 2.4.x, once swap space has been used,
it may never be released.

Background:
I experienced a very strange X server crash (when a certain page
is accessed via Netscape and a string is searched for repeatedly,
the X server crashes on the second or third search).
I downloaded the XFree86 source to reproduce the problem and,
using a server built with debug symbols, to see where the segmentation
fault occurs. I finally
traced it to an incorrect pointer value used in the backing store bitblt
routine.
However, I can't see why the pointer value would be incorrect at all.
(Maybe I am not familiar enough with the mmap and munmap calls invoked
by malloc() and free().)
Also, I was a little surprised to see a bug of this nature remain
in the X server at this stage.

Anyway, suspecting some sort of VM problem on the kernel side,
I wrote the attached program and ran it to see
how the system behaves when the swap partition is first used.
(The X server crashes when the system is
just about to start using swap in my current config: 256 MB memory
and 80 MB swap.)

After running a few invocations of the attached program with
varying arguments, I noticed that once swap has been used (I watch it
using xosview from X, or cat /proc/meminfo),
the used swap does not seem to get released!?
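
For anyone who wants to repeat this observation without xosview, here is a minimal sketch that reads the same counters through sysinfo(2). The helper name swap_usage() is hypothetical; the struct sysinfo fields totalswap, freeswap and mem_unit are the ones documented for Linux. Note that "used" here only means "not free": it cannot tell swap slots holding live data from slots kept reserved for pages that are already back in memory.

```c
#include <stdio.h>
#include <sys/sysinfo.h>

/*
 * Hypothetical helper: report total and used swap in bytes.
 * Returns 0 on success, -1 if sysinfo(2) fails.
 */
int swap_usage(unsigned long long *total, unsigned long long *used)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return -1;

    /* Counters are expressed in units of si.mem_unit bytes. */
    *total = (unsigned long long)si.totalswap * si.mem_unit;
    *used  = (unsigned long long)(si.totalswap - si.freeswap) * si.mem_unit;
    return 0;
}
```

Calling swap_usage() before a memhog run, and again after it has exited, and comparing the two "used" values reproduces the check described above.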

I thought it was suspicious, but didn't speak up, since if this were
a bug, it would indeed be a showstopper for many commercial
applications.
Someone must have noticed this and fixed it already, or so I thought.

From the current discussion, I gather that this may indeed be
unwanted behavior.
If this is a bug in the 2.4.x VM, then I think some installations
would appreciate a fix very much.


--- memhog.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

/*
 * With USE_REALLOC defined, the buffer grows via realloc(), which calls
 * mremap().  Otherwise each pass uses malloc() and free(), which call
 * mmap() and munmap().
 */

#define MB (1024 * 1024)

static void usage(void)
{
  fprintf(stderr,
          "memhog [limit]\nlimit is in MB units. Default is 64 (MB).\n");
  exit(EXIT_FAILURE);
}

int main(int argc, char *argv[])
{
  size_t i;
  size_t limit = 64 * MB;
  char *p = NULL;

  if (argc == 2) {
    int t = atoi(argv[1]);

    if (t <= 0)
      usage();
    limit = (size_t)t * MB;
  } else if (argc >= 3) {
    usage();
  }

  /* Allocate and touch 1 MB, 2 MB, ... up to the limit. */
  i = MB;
  while (i <= limit) {
    printf("i = %lu\n", (unsigned long)i);

#ifdef USE_REALLOC
    p = realloc(p, i);
#else
    p = malloc(i);
#endif

    if (p == NULL) {
      printf("malloc returned NULL\n");
      exit(EXIT_FAILURE);
    }

    /* Write to every byte so the pages are really instantiated. */
    memset(p, 0xA5, i);

    sleep(1);

#ifndef USE_REALLOC
    free(p);
#endif

    /* i *= 2; */
    i += 1 * MB;
  }

  exit(EXIT_SUCCESS);
}






Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Stephen C. Tweedie

Hi,

On Wed, May 02, 2001 at 01:49:16PM +0100, Hugh Dickins wrote:
> On Wed, 2 May 2001, Stephen C. Tweedie wrote:
> > 
> > So the aim is more complex.  Basically, once we are short on VM, we
> > want to eliminate redundant copies of swap data.  That implies two
> > possible actions, not one --- we can either remove the swap page for
> > data which is already in memory, or we can remove the in-memory copy
> > of data which is already on swap.  Which one is appropriate will
> > depend on whether the ptes in the system point to the swap entry or
> > the memory entry.  If we have ptes pointing to both, then we cannot
> > free either.
> 
> Sorry for stating the obvious, but that last sentence gives up too easily.
> If we have ptes pointing to both, then we cannot free either until we have
> replaced all the references to one by references to the other.

Sure, but it's far from obvious that we need to worry about this.  2.2
has exactly this same behaviour for shared pages, and so if people are
complaining about a 2.4 regression, this particular aspect of the
behaviour is clearly not the underlying problem.

--Stephen



Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Andi Kleen

On Mon, Apr 30, 2001 at 12:07:44PM -0700, David S. Miller wrote:
> Even more effective is:
> 
> mv /wherever/exeimage /usr/bin/exeimage
> 
> The kernel keeps around the contents of the old file while
> the executing process still runs.
> 
> This is also basically how things like libc get installed.
> A single mv not only preserves currently referenced contents,
> it is also atomic.

But something must have been not working with it for mmaps/shlibs 
(not executables); at least historically.
At least I remember that all hell broke loose when you tried to update
libc by cp'ing a new one to /lib/libc.so in the 1.2 days. cp should create a 
new inode (it uses O_CREAT) so in theory it should be coherent by the inode 
reference; but somehow it didn't use to work and random already running 
programs started to segfault. This was long ago. I wonder if old
GNU cp used O_TRUNC instead of O_CREAT, or was there some other kernel bug 
with mappings (hopefully long since fixed). Does anybody remember? 

-Andi



Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Hugh Dickins

On Wed, 2 May 2001, Stephen C. Tweedie wrote:
> 
> So the aim is more complex.  Basically, once we are short on VM, we
> want to eliminate redundant copies of swap data.  That implies two
> possible actions, not one --- we can either remove the swap page for
> data which is already in memory, or we can remove the in-memory copy
> of data which is already on swap.  Which one is appropriate will
> depend on whether the ptes in the system point to the swap entry or
> the memory entry.  If we have ptes pointing to both, then we cannot
> free either.

Sorry for stating the obvious, but that last sentence gives up too easily.
If we have ptes pointing to both, then we cannot free either until we have
replaced all the references to one by references to the other.

Hugh




Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Stephen C. Tweedie

Hi,

On Wed, May 02, 2001 at 12:54:15PM +0200, Rogier Wolff wrote:
> 
> first: Thanks for clearing this up for me. 
> 
> So, there are in fact some more "states" a swap-page can be in:
> 
>   -(0) free
>   -(1) allocated, not in mem. 
>   -(2) on swap, valid copy of memory. 
>   -(3) on swap: invalid copy, allocated for fragmentation, can 
>   be freed on demand if we are close to running out of swap.
> 
> If we are running low on (0) swap-pages we can first start to reap the (3)
> pages, and if that runs out, we can start reaping the (2)
> pages. Right?

Yes.  However, there is other state to worry about too.  Anonymous
pages are referenced from process page tables.  As long as the page
tables are referring to the copy in memory, you can free up the copy
on disk.  However, if any ptes point to the copy on disk, you cannot
(and remember, process forks can result in multiple process mm's
pointing to the same anonymous page, and some of those mm's may point
to swap while others point to the in-core page).

So the aim is more complex.  Basically, once we are short on VM, we
want to eliminate redundant copies of swap data.  That implies two
possible actions, not one --- we can either remove the swap page for
data which is already in memory, or we can remove the in-memory copy
of data which is already on swap.  Which one is appropriate will
depend on whether the ptes in the system point to the swap entry or
the memory entry.  If we have ptes pointing to both, then we cannot
free either.
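
The rule in the two paragraphs above can be written down as a toy decision function. This is a sketch only, not kernel code: struct dup_page, its two counters and choose_action() are hypothetical names, and it assumes the in-memory copy and the swap copy hold identical data.

```c
/* Which of the two redundant copies may be dropped. */
enum reclaim_action {
    FREE_NEITHER,       /* ptes point to both copies */
    FREE_SWAP_COPY,     /* every pte points to the in-core page */
    FREE_MEMORY_COPY    /* every pte points to the swap entry */
};

/* Hypothetical bookkeeping for one page that exists in memory and on swap. */
struct dup_page {
    unsigned ptes_to_memory;  /* ptes mapping the in-core page */
    unsigned ptes_to_swap;    /* ptes still holding the swap entry */
};

enum reclaim_action choose_action(const struct dup_page *p)
{
    if (p->ptes_to_memory > 0 && p->ptes_to_swap > 0)
        return FREE_NEITHER;
    if (p->ptes_to_swap == 0)
        return FREE_SWAP_COPY;    /* nothing refers to the disk copy */
    return FREE_MEMORY_COPY;      /* nothing refers to the in-core page */
}
```

After a fork, for example, one mm may still hold the swap entry while another has faulted the page in; that is the FREE_NEITHER case.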

Cheers,
 Stephen



Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Rogier Wolff

Stephen C. Tweedie wrote:
> Hi,
> 
> On Tue, May 01, 2001 at 06:14:54PM +0200, Rogier Wolff wrote:
> 
> > Shouldn't the algorithm be: 
> > 
> > - If (current_access == write )
> >   free (swap_page);
> >   else
> >   map (page, READONLY)
> > 
> > and 
> >   when a write access happens, we fault again, and map free the 
> >   swap-page as it is now dirty anyway. 
> 
> That's what 2.2 did.  2.4 doesn't have to. 
> 
> The trouble is, you really want contiguous virtual memory to remain
> contiguous on swap.  Freeing individual pages like this on fault can
> cause a great deal of fragmentation in swap.  We'd far rather keep the
> swap page reserved for future use by the same page so that the VM
> region remains contiguous on disk.
> 
> That's fine as far as it goes, but the problem happens if you _never_
> free up such pages.  We should reap the unused swap page if we run out
> of swap.  We don't, and _that_ is the problem --- not the fact that
> the page is left allocated in the first place, but the fact that we
> don't do anything about it once we are short on disk.

first: Thanks for clearing this up for me. 

So, there are in fact some more "states" a swap-page can be in:

-(0) free
-(1) allocated, not in mem. 
-(2) on swap, valid copy of memory. 
-(3) on swap: invalid copy, allocated for fragmentation, can 
be freed on demand if we are close to running out of swap.

If we are running low on (0) swap-pages we can first start to reap the (3)
pages, and if that runs out, we can start reaping the (2)
pages. Right?
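
To make the proposed order concrete, here is a minimal sketch in C. It is a toy model, not kernel code: the enum names and pick_slot_to_reap() are hypothetical, and a flat array stands in for whatever structure tracks the swap slots. It prefers state (3) slots and falls back to state (2) slots only when none are left.

```c
/* The four states from the list above; names are hypothetical. */
enum swap_slot_state {
    SLOT_FREE = 0,        /* (0) free */
    SLOT_ALLOCATED,       /* (1) allocated, not in mem */
    SLOT_VALID_COPY,      /* (2) on swap, valid copy of memory */
    SLOT_STALE_RESERVED   /* (3) invalid copy, kept against fragmentation */
};

/*
 * Return the index of the slot to reap next, or -1 if nothing can be
 * reaped.  State (3) slots go first; a state (2) slot is used only when
 * no state (3) slot exists.  States (0) and (1) are never candidates.
 */
int pick_slot_to_reap(const enum swap_slot_state *slots, int nslots)
{
    int i, fallback = -1;

    for (i = 0; i < nslots; i++) {
        if (slots[i] == SLOT_STALE_RESERVED)
            return i;
        if (slots[i] == SLOT_VALID_COPY && fallback < 0)
            fallback = i;   /* remember the first state (2) slot */
    }
    return fallback;
}
```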


Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-05-02 Thread Stephen C. Tweedie

Hi,

On Tue, May 01, 2001 at 06:14:54PM +0200, Rogier Wolff wrote:

> Shouldn't the algorithm be: 
> 
> - If (current_access == write )
>   free (swap_page);
>   else
>   map (page, READONLY)
> 
> and 
>   when a write access happens, we fault again, and map free the 
>   swap-page as it is now dirty anyway. 

That's what 2.2 did.  2.4 doesn't have to. 

The trouble is, you really want contiguous virtual memory to remain
contiguous on swap.  Freeing individual pages like this on fault can
cause a great deal of fragmentation in swap.  We'd far rather keep the
swap page reserved for future use by the same page so that the VM
region remains contiguous on disk.

That's fine as far as it goes, but the problem happens if you _never_
free up such pages.  We should reap the unused swap page if we run out
of swap.  We don't, and _that_ is the problem --- not the fact that
the page is left allocated in the first place, but the fact that we
don't do anything about it once we are short on disk.

--Stephen



Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Rik van Riel

On Wed, 2 May 2001, Roger Larsson wrote:
> On Wednesday 02 May 2001 02:43, Rik van Riel wrote:
> > On Tue, 1 May 2001, David S. Miller wrote:
> > > Rik van Riel writes:
> > >  > Then we will be scanning through memory looking for something to
> > >  > swap out (otherwise we'd not be in need of swap space, right?).
> > >  > At this point we can simply free up swap entries while scanning

> We could reclaim swap space for dirty pages. They have to be
> rewritten anyway...
> 
> Or would the fragmentation risk be too high?

I guess the fragmentation would get worse. Also, when you've got
more than enough swap left there's absolutely no reason to reclaim
the swap when pages get dirtied.

I just guess we should start reclaiming swap space as soon as the
free swap space gets below, say, 5%. We could go for a more complex
system of only reclaiming the swap space of dirty pages from a lower
threshold on, but I doubt that's worth the complexity.
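
The simple variant amounts to a single comparison. A minimal sketch, assuming a hypothetical should_reclaim_swap() that is handed the free and total swap slot counts; the 5% figure is the "say, 5%" from the paragraph above, not a tuned value.

```c
#include <stdbool.h>

#define SWAP_RECLAIM_PERCENT 5   /* the "say, 5%" threshold; an assumed value */

/*
 * Hypothetical check: true once free swap has dropped below the
 * threshold, i.e. when reclaiming swap of swapped-in pages should start.
 */
bool should_reclaim_swap(unsigned long long free_slots,
                         unsigned long long total_slots)
{
    if (total_slots == 0)
        return false;            /* no swap configured */

    /* free / total < 5%, cross-multiplied to avoid integer division. */
    return free_slots * 100 < total_slots * SWAP_RECLAIM_PERCENT;
}
```

With plenty of swap left the function stays false, so dirtied pages keep their reserved slots and the layout on disk stays contiguous.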

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Roger Larsson

On Wednesday 02 May 2001 02:43, Rik van Riel wrote:
> On Tue, 1 May 2001, David S. Miller wrote:
> > Rik van Riel writes:
> >  > Then we will be scanning through memory looking for something to
> >  > swap out (otherwise we'd not be in need of swap space, right?).
> >  > At this point we can simply free up swap entries while scanning
> >  > through memory looking for stuff to swap out.
> >
> > Sounds a lot like my patch I posted a few weeks ago:
>
> Not really. Your patch only reclaims swap cache pages that
> hang around after a process exit()s. What I want to do is
> reclaim swap space of pages which have been swapped in so
> we can use that swap space to swap something else out.
>

We could reclaim swap space for dirty pages. They have to be
rewritten anyway...

Or would the fragmentation risk be too high?

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden



Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Rik van Riel

On Tue, 1 May 2001, David S. Miller wrote:
> Rik van Riel writes:
>  > Then we will be scanning through memory looking for something to
>  > swap out (otherwise we'd not be in need of swap space, right?).
>  > At this point we can simply free up swap entries while scanning
>  > through memory looking for stuff to swap out.
> 
> Sounds a lot like my patch I posted a few weeks ago:

Not really. Your patch only reclaims swap cache pages that
hang around after a process exit()s. What I want to do is
reclaim swap space of pages which have been swapped in so
we can use that swap space to swap something else out.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread David S. Miller


Rik van Riel writes:
 > Then we will be scanning through memory looking for something to
 > swap out (otherwise we'd not be in need of swap space, right?).
 > At this point we can simply free up swap entries while scanning
 > through memory looking for stuff to swap out.

Sounds a lot like my patch I posted a few weeks ago:

--- mm/vmscan.c.~1~ Wed Apr 11 19:05:59 2001
+++ mm/vmscan.c Thu Apr 12 17:19:51 2001
@@ -441,8 +441,14 @@
     maxscan = nr_inactive_dirty_pages;
     while ((page_lru = inactive_dirty_list.prev) != &inactive_dirty_list &&
                 maxscan-- > 0) {
+        int dead_swap_page;
+
         page = list_entry(page_lru, struct page, lru);
 
+        dead_swap_page =
+            (PageSwapCache(page) &&
+             page_count(page) == (1 + !!page->buffers));
+
         /* Wrong page on list?! (list corruption, should not happen) */
         if (!PageInactiveDirty(page)) {
             printk("VM: page_launder, wrong page on list.\n");
@@ -453,9 +459,10 @@
         }
 
         /* Page is or was in use?  Move it to the active list. */
-        if (PageTestandClearReferenced(page) || page->age > 0 ||
-                (!page->buffers && page_count(page) > 1) ||
-                page_ramdisk(page)) {
+        if (!dead_swap_page &&
+            (PageTestandClearReferenced(page) || page->age > 0 ||
+             (!page->buffers && page_count(page) > 1) ||
+             page_ramdisk(page))) {
             del_page_from_inactive_dirty_list(page);
             add_page_to_active_list(page);
             continue;
@@ -481,8 +488,11 @@
             if (!writepage)
                 goto page_active;
 
-            /* First time through? Move it to the back of the list */
-            if (!launder_loop) {
+            /* First time through? Move it to the back of the list,
+             * but not if it is a dead swap page. We want to reap
+             * those as fast as possible.
+             */
+            if (!launder_loop && !dead_swap_page) {
                 list_del(page_lru);
                 list_add(page_lru, &inactive_dirty_list);
                 UnlockPage(page);
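
For readers following the patch, the dead_swap_page test can be sketched in isolation. This is only an illustration under stated assumptions: struct mock_page and is_dead_swap_page() are hypothetical names, not kernel code, and the fields merely stand in for PageSwapCache(page), page_count(page) and page->buffers. The reading assumed here is that a swap cache page whose only references are the cache itself, plus one more if buffers are attached, is no longer used by any process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields the patch looks at. */
struct mock_page {
    bool in_swap_cache;  /* stands in for PageSwapCache(page) */
    int count;           /* stands in for page_count(page) */
    void *buffers;       /* stands in for page->buffers */
};

/*
 * Same shape as the patch's expression: the page is in the swap cache
 * and its reference count is exactly 1, or 2 when buffers are attached.
 */
static bool is_dead_swap_page(const struct mock_page *page)
{
    assert(page != NULL);
    return page->in_swap_cache &&
           page->count == (1 + !!page->buffers);
}
```

A page with a count above that baseline is presumably still mapped somewhere, so it falls through to the normal "is or was in use" checks.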



Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Rik van Riel

On Tue, 1 May 2001, Stephen C. Tweedie wrote:

> The right fix is to reclaim such pages only when we need to.  To
> disable swap caching when we still have enough swap free would hurt
> users who have the spare swap to cope with it.

That's easy enough. When we are:
1. almost out of swap and
2. need swap space

Then we will be scanning through memory looking for something to
swap out (otherwise we'd not be in need of swap space, right?).
At this point we can simply free up swap entries while scanning
through memory looking for stuff to swap out.
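
The idea above can be sketched as a toy model. Everything here is an assumption made for illustration: struct sim_page, the slot_used array and reclaim_swap_while_scanning() are hypothetical and do not correspond to kernel data structures, and the caller is assumed to pass a slot_used array large enough for every swap_slot value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a page; not the kernel's struct page. */
struct sim_page {
    bool resident;  /* page currently lives in RAM */
    int swap_slot;  /* swap slot still reserved for it, or -1 for none */
};

/*
 * Walk the pages as a swap-out scan would and release the swap slot of
 * every resident page that still holds one. Nothing is released unless
 * swap is nearly full. Returns the number of slots released.
 */
static size_t reclaim_swap_while_scanning(struct sim_page *pages, size_t n,
                                          bool *slot_used,
                                          bool swap_nearly_full)
{
    size_t freed = 0;

    assert(pages != NULL && slot_used != NULL);
    if (!swap_nearly_full)
        return 0;  /* plenty of swap left: keep the swap cache benefit */

    for (size_t i = 0; i < n; i++) {
        if (pages[i].resident && pages[i].swap_slot >= 0) {
            slot_used[pages[i].swap_slot] = false;
            pages[i].swap_slot = -1;
            freed++;
        }
    }
    return freed;
}
```

The early return models the point made in the quoted text: while swap is plentiful, the reserved slots are left alone.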

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Rik van Riel

On 1 May 2001, Christoph Rohland wrote:
> On Mon, 30 Apr 2001, Alan Cox wrote:
> >> paging in just released 2.4.4, but in previous kernel, a page that
> >> was paged-out, reserves its place in swap even if it is paged-in
> >> again, so once you have paged-out all your ram at least once, you
> >> can't get any more memory, even if swap is 'empty'.
> > 
> > This is a bug in the 2.4 VM, nothing more or less. It and the
> > horrible bounce buffer bugs are forcing large machines to remain on
> > 2.2. So it has to get fixed
> 
> Yes, it is a bug, and thanks for stating this so clearly.

I finished my USENIX paper (yeah I know, exactly on the deadline ;))
so now I have time again. This bug is one of the more important ones
on my TODO list.

Let me see if I can cook up a race-free way of fixing this one.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Rogier Wolff

Stephen C. Tweedie wrote:
> Hi,
> 
> On Mon, Apr 30, 2001 at 07:12:12PM +0100, Alan Cox wrote:
> > > paging in just released 2.4.4, but in previous kernel, a page that was
> > > paged-out, reserves its place in swap even if it is paged-in again, so
> > > once you have paged-out all your ram at least once, you can't get any
> > > more memory, even if swap is 'empty'.
> > 
> > This is a bug in the 2.4 VM, nothing more or less. It and the horrible bounce
> > buffer bugs are forcing large machines to remain on 2.2. So it has to get 
> > fixed
> 
> Umm, 2.2 can behave in the same way.  The only difference in the 2.4
> behaviour is that 2.4 can maintain the swap cache effect for dirty
> pages as well as clean ones.  An application which creates a large
> in-core data set and then does not modify it will show exactly the
> same behaviour on 2.2.
> 
> To call it a "bug" is to imply that "fixing it" is the right thing to
> do.  It might be in some cases, but discarding the swap entry has a
> cost --- you fragment swap, and if the page in memory is clean, you
> end up increasing the amount of swap IO.  

Shouldn't the algorithm be: 

- if (current_access == write)
      free(swap_page);
  else
      map(page, READONLY);

and
  when a write access happens, we fault again, map the page writable,
  and free the swap page, as it is now dirty anyway.

The extra fault when the first access is a read (e.g. in a
read-modify-write case) will seem like a lot of overhead, but if you
save an IO in 1% of the cases, it will be worth it...
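
A minimal sketch of that policy, assuming a toy page descriptor. The names struct swapin_page, swapin_fault() and write_protect_fault() are hypothetical and only model the two decisions; they are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum access_kind { ACCESS_READ, ACCESS_WRITE };

/* Hypothetical model of a page that has just been brought back from swap. */
struct swapin_page {
    bool has_swap_copy;  /* swap slot still holds a valid copy */
    bool writable;       /* mapping permits writes */
};

/* Swap-in fault: drop the swap copy at once only if the access is a write. */
static void swapin_fault(struct swapin_page *p, enum access_kind access)
{
    assert(p != NULL);
    if (access == ACCESS_WRITE) {
        p->has_swap_copy = false;  /* page is about to become dirty */
        p->writable = true;
    } else {
        p->writable = false;       /* read-only mapping, swap copy kept */
    }
}

/* Later write to a read-only mapping: the extra fault discussed above. */
static void write_protect_fault(struct swapin_page *p)
{
    assert(p != NULL);
    p->has_swap_copy = false;      /* now dirty, the swap copy is stale */
    p->writable = true;
}
```

A page that is only ever read keeps its swap copy and can be dropped later without any write to swap; that is the I/O saving being weighed against the extra fault.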

Is swap allocated "congruently"? I didn't think so. That would mean
that if one page of a process has to be paged out, you allocate an
area on the swap that maps directly on the whole addressing space of
that process. That way prefetching stuff from swap can actually help a
lot.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Stephen C. Tweedie

Hi,

On Mon, Apr 30, 2001 at 07:12:12PM +0100, Alan Cox wrote:
> > paging in just released 2.4.4, but in previous kernel, a page that was
> > paged-out, reserves its place in swap even if it is paged-in again, so
> > once you have paged-out all your ram at least once, you can't get any
> > more memory, even if swap is 'empty'.
> 
> This is a bug in the 2.4 VM, nothing more or less. It and the horrible bounce
> buffer bugs are forcing large machines to remain on 2.2. So it has to get 
> fixed

Umm, 2.2 can behave in the same way.  The only difference in the 2.4
behaviour is that 2.4 can maintain the swap cache effect for dirty
pages as well as clean ones.  An application which creates a large
in-core data set and then does not modify it will show exactly the
same behaviour on 2.2.

To call it a "bug" is to imply that "fixing it" is the right thing to
do.  It might be in some cases, but discarding the swap entry has a
cost --- you fragment swap, and if the page in memory is clean, you
end up increasing the amount of swap IO.  

The right fix is to reclaim such pages only when we need to.  To
disable swap caching when we still have enough swap free would hurt
users who have the spare swap to cope with it.

--Stephen



Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Christoph Rohland

Hi Alan,

On Mon, 30 Apr 2001, Alan Cox wrote:
>> paging in just released 2.4.4, but in previous kernel, a page that
>> was paged-out, reserves its place in swap even if it is paged-in
>> again, so once you have paged-out all your ram at least once, you
>> can't get any more memory, even if swap is 'empty'.
> 
> This is a bug in the 2.4 VM, nothing more or less. It and the
> horrible bounce buffer bugs are forcing large machines to remain on
> 2.2. So it has to get fixed

Yes, it is a bug, and thanks for stating this so clearly.

But a lot of the big servers can go to 2.4 because SYSV shm/shm
fs/tmpfs will reclaim the swap entries on swapin. So big databases and
application servers which rely on shm are not affected.

Greetings
Christoph



Re: 2.4 and 2GB swap partition limit

2001-05-01 Thread Rik van Riel

On Wed, 2 May 2001, Roger Larsson wrote:
> On Wednesday 02 May 2001 02:43, Rik van Riel wrote:
> > On Tue, 1 May 2001, David S. Miller wrote:
> > > Rik van Riel writes:
> > >  > Then we will be scanning through memory looking for something to
> > >  > swap out (otherwise we'd not be in need of swap space, right?).
> > >  > At this point we can simply free up swap entries while scanning
>
> We could reclaim swap space for dirty pages. They have to be
> rewritten anyway...
>
> Or would the fragmentation risk be too high?

I guess the fragmentation would get worse. Also, when you've got
more than enough swap left there's absolutely no reason to reclaim
the swap when pages get dirtied.

I just guess we should start reclaiming swap space as soon as the
free swap space gets below, say, 5%. We could go for a more complex
system of only reclaiming the swap space of dirty pages from a lower
threshold on, but I doubt that's worth the complexity.
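
The threshold test could look roughly like this. It is a sketch only: should_reclaim_swap() is a hypothetical helper, and the 5% figure is the "say, 5%" guess from the text above, not a tuned value.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when free swap has dropped below 5% of the total.
 * free/total < 5/100 is rewritten as free*100 < total*5 to avoid
 * division and floating point. Very large slot counts could overflow
 * the multiplication; the sketch ignores that.
 */
static bool should_reclaim_swap(unsigned long free_slots,
                                unsigned long total_slots)
{
    assert(free_slots <= total_slots);
    return total_slots > 0 && free_slots * 100 < total_slots * 5;
}
```

With 100 slots in total, 4 free slots trigger reclaim while 5 do not.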

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)



Re: 2.4 and 2GB swap partition limit

2001-04-30 Thread Andreas Ferber

Hi,

On Mon, Apr 30, 2001 at 03:14:25PM -0400, Richard B. Johnson wrote:

> > mv /wherever/exeimage /usr/bin/exeimage
[...]
> > This is also basically how things like libc get installed.
> > A single mv not only preserves currently referenced contents,
> > it is atomic.

One restriction: /wherever and /usr/bin must be on the same partition,
otherwise the mv will be the same as "rm /usr/bin/exeimage && cp
/wherever/exeimage /usr/bin/exeimage && rm /wherever/exeimage", which
is for sure not atomic.

> Sure, but now you can't get back if the new software doesn't run.
> This is why I recommended the two steps and cautioned about testing
> the new stuff first.

Then create a hardlink first, to keep a backup _and_ atomically
replace the file.
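
In C, the hardlink-then-rename sequence might be sketched as follows, using only link(2), unlink(2) and rename(2). The helper name replace_with_backup() and the step that removes a stale backup first are assumptions added for the example; all three paths are assumed to live on the same filesystem, per the restriction above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Replace `target` with `staged` while keeping the old contents
 * reachable under `backup`.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int replace_with_backup(const char *staged, const char *target,
                               const char *backup)
{
    assert(staged != NULL && target != NULL && backup != NULL);

    /* Drop a stale backup, if any; a missing one is not an error. */
    if (unlink(backup) != 0 && errno != ENOENT)
        return -1;

    /* Give the old inode a second name: this is the backup. */
    if (link(target, backup) != 0)
        return -1;

    /* Atomically repoint `target` at the new inode. */
    return rename(staged, target);
}
```

Rolling back is then another rename of the backup over the target, and processes that already run the old image keep their reference to the old inode throughout.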

Andreas
-- 
Build a system that even a fool can use and only a fool will want to use it.




RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread Richard B. Johnson

On Mon, 30 Apr 2001, David S. Miller wrote:

> 
> Richard B. Johnson writes:
>  > On Mon, 30 Apr 2001, Torrey Hoffman wrote:
>  > > In general, is there a safe way to replace executable files for
>  > > programs that might be running while their on-disk images are
>  > > replaced?
>  > 
>  > Yes. Perfectly safe:
>  > 
>  > mv /usr/bin/exeimage /usr/bin/exeimage.sav
>  > cp /wherever/exeimage /usr/bin/exeimage
>  > 
>  > 
>  > The executing task will continue to use the old image until it exits.
> 
> Even more effective is:
> 
> mv /wherever/exeimage /usr/bin/exeimage
> 
> The kernel keeps around the contents of the old file while
> the executing process still runs.
> 
> This is also basically how things like libc get installed.
> A single mv not only preserves currently referenced contents,
> it is atomic.
> 
> Later,
> David S. Miller
> [EMAIL PROTECTED]

Sure, but now you can't get back if the new software doesn't run.
This is why I recommended the two steps and cautioned about testing
the new stuff first.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread David S. Miller


Richard B. Johnson writes:
 > On Mon, 30 Apr 2001, Torrey Hoffman wrote:
 > > In general, is there a safe way to replace executable files for
 > > programs that might be running while their on-disk images are
 > > replaced?
 > 
 > Yes. Perfectly safe:
 > 
 > mv /usr/bin/exeimage /usr/bin/exeimage.sav
 > cp /wherever/exeimage /usr/bin/exeimage
 > 
 > 
 > The executing task will continue to use the old image until it exits.

Even more effective is:

mv /wherever/exeimage /usr/bin/exeimage

The kernel keeps around the contents of the old file while
the executing process still runs.

This is also basically how things like libc get installed.
A single mv not only preserves currently referenced contents,
it is atomic.
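
A small user-space sketch of why this works, assuming both files are on the same filesystem. The helper read_old_after_rename() is hypothetical; an open file descriptor stands in here for the reference a running process holds on its executable.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open `target`, rename `staged` over it, then read through the
 * descriptor opened beforehand. Returns the number of bytes read into
 * buf (taken from the old file), or -1 on error.
 */
static ssize_t read_old_after_rename(const char *staged, const char *target,
                                     char *buf, size_t len)
{
    ssize_t n;
    int fd;

    assert(staged != NULL && target != NULL && buf != NULL);

    fd = open(target, O_RDONLY);   /* reference to the old inode */
    if (fd < 0)
        return -1;

    if (rename(staged, target) != 0) {
        close(fd);
        return -1;
    }

    n = read(fd, buf, len);        /* still served from the old inode */
    close(fd);
    return n;
}
```

After the call, opening the target by name yields the new contents, while the bytes read through the earlier descriptor are the old ones.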

Later,
David S. Miller
[EMAIL PROTECTED]



RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread Richard B. Johnson

On Mon, 30 Apr 2001, Torrey Hoffman wrote:

> 
> In general, is there a safe way to replace executable files for
> programs that might be running while their on-disk images are
> replaced?
> 

Yes. Perfectly safe:

mv /usr/bin/exeimage /usr/bin/exeimage.sav
cp /wherever/exeimage /usr/bin/exeimage


The executing task will continue to use the old image until it exits.
New tasks will use the new image. You can even replace `mv` and `cp`
this way. It is best to test new programs first, though, so you
know that it's linked with a runtime library that you have on your
system.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread Torrey Hoffman



Kenneth Johansson wrote:
> Jonathan Lundell wrote:
> >
> > (Does Linux swap out text, by the way, he asks ignorantly?)
> 
> .text is just dropped and read back from the actual file; it's
> not put into the swap

Is this always true, even for init?  Can init be swapped out?  

In general, is there a safe way to replace executable files for
programs that might be running while their on-disk images are
replaced?

Thanks for any hints...

Torrey Hoffman
[EMAIL PROTECTED]



Re: 2.4 and 2GB swap partition limit

2001-04-30 Thread Alan Cox

> > The swap I have is 2 partitions, one on each drive both with a priority of
> > 0.  Personally, I like the way it's done on my box.
> 
> So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
> swap space. Fine with me. 

Stupid argument. Very stupid argument. Take a 16Gb server. You now want
to buy 64Gb of hard disk for the swap. Only because of partition limits you'll
need at least 2 disks entirely dedicated to it, which also means a controller,
a larger PSU and a bigger case.

The swap behaviour of 2.4 is a bug. 




Re: 2.4 and 2GB swap partition limit

2001-04-30 Thread Alan Cox

> paging in just released 2.4.4, but in previous kernel, a page that was
> paged-out, reserves its place in swap even if it is paged-in again, so
> once you have paged-out all your ram at least once, you can't get any
> more memory, even if swap is 'empty'.

This is a bug in the 2.4 VM, nothing more or less. It and the horrible bounce
buffer bugs are forcing large machines to remain on 2.2. So it has to get 
fixed


Alan

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-04-30 Thread Alan Cox

 paging in just released 2.4.4, but in previuos kernel, a page that was
 paged-out, reserves its place in swap even if it is paged-in again, so
 once you have paged-out all your ram at least once, you can't get any
 more memory, even if swap is 'empty'.

This is a bug in the 2.4 VM, nothing more or less. It and the horrible bounce
buffer bugs are forcing large machines to remain on 2.2. So it has to get 
fixed


Alan

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-04-30 Thread Alan Cox

  The swap I have is 2 partitions, one on each drive both with a priority of
  0.  Personally, I like the way it's done on my box.
 
 So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
 swap space. Fine with me. 

Stupid argument. Very stupid argument. Take a 16Gb server. You now want
to buy 64Gb of hard disk for the swap. Only because of partition limits you'll
beed at least 2 disks entirely dedicated to it, which also means a controller
a larger PSU and a bigger case.

The swap behaviour of 2.4 is a bug. 

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread Torrey Hoffman



Kenneth Johansson wrote:
 Jonathan Lundell wrote:
 
  (Does Linux swap out text, by the way, he asks ignorantly?)
 
 .text is just droped and read back from the actuall file it's 
 not put into the swap

Is this always true, even for init?  Can init be swapped out?  

In general, is there a safe way to replace executable files for
programs that might be running while their on-disk images are
replaced?

Thanks for any hints...

Torrey Hoffman
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread Richard B. Johnson

On Mon, 30 Apr 2001, Torrey Hoffman wrote:

 
> In general, is there a safe way to replace executable files for
> programs that might be running while their on-disk images are
> replaced?
 

Yes. Perfectly safe:

mv /usr/bin/exeimage /usr/bin/exeimage.sav
cp /wherever/exeimage /usr/bin/exeimage


The executing task will continue to use the old image until it exits.
New tasks will use the new image. You can even replace `mv` and `cp`
this way. It is best to test new programs first, though, so you
know that it's linked with a runtime library that you have on your
system.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot...; Actual explanation
obtained from the Micro$oft help desk.
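
A minimal sketch of why the two-step sequence above leaves running programs alone. It works in a scratch directory rather than /usr/bin, and the file names and contents are made up for illustration. The mv only renames, so the old inode survives under the .sav name; the cp then finds no existing target and has to create a fresh inode. Between the two commands the name briefly does not exist, which is the gap a single mv avoids.

```shell
set -eu

# Hypothetical scratch area; nothing here touches a real system path.
dir=$(mktemp -d)
printf 'old image\n' > "$dir/exeimage"
printf 'new image\n' > "$dir/exeimage.new"

old_inode=$(ls -i "$dir/exeimage" | awk '{print $1}')

# Step 1: rename only. Same inode, new name.
mv "$dir/exeimage" "$dir/exeimage.sav"

# Step 2: the target name is free, so cp creates a new file (new inode).
cp "$dir/exeimage.new" "$dir/exeimage"

sav_inode=$(ls -i "$dir/exeimage.sav" | awk '{print $1}')
new_inode=$(ls -i "$dir/exeimage" | awk '{print $1}')

echo "old inode: $old_inode  backup inode: $sav_inode  new inode: $new_inode"

rm -rf "$dir"
```

The backup keeps the original inode number, so anything still mapping the old image is undisturbed, and the freshly copied file gets a different inode.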





RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread David S. Miller


Richard B. Johnson writes:
> On Mon, 30 Apr 2001, Torrey Hoffman wrote:
> > In general, is there a safe way to replace executable files for
> > programs that might be running while their on-disk images are
> > replaced?
> 
> Yes. Perfectly safe:
> 
> mv /usr/bin/exeimage /usr/bin/exeimage.sav
> cp /wherever/exeimage /usr/bin/exeimage
> 
> 
> The executing task will continue to use the old image until it exits.

Even more effective is:

mv /wherever/exeimage /usr/bin/exeimage

The kernel keeps around the contents of the old file while
the executing process still runs.

This is also basically how things like libc get installed.
A single mv not only preserves currently referenced contents,
it is atomic.

Later,
David S. Miller
[EMAIL PROTECTED]
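
A small sketch of the behaviour described above, under the assumption that source and target sit on the same filesystem so that mv reduces to a rename. An open file descriptor stands in for the running process; the scratch directory and file contents are invented for the demonstration.

```shell
set -eu

dir=$(mktemp -d)
printf 'old image\n' > "$dir/exeimage"
printf 'new image\n' > "$dir/exeimage.new"

# Hold the old file open on descriptor 3, as a running program would
# keep its executable referenced.
exec 3< "$dir/exeimage"

# Replace the name in one step; the old inode loses its name but not its data.
mv -f "$dir/exeimage.new" "$dir/exeimage"

seen_by_old=$(cat <&3)               # still reads the old contents
seen_by_new=$(cat "$dir/exeimage")   # a fresh open sees the new file

exec 3<&-                            # last reference gone; old data can be freed
rm -rf "$dir"

echo "old holder sees: $seen_by_old"
echo "new opener sees: $seen_by_new"
```

The old contents are only released once the last holder closes the file, which matches the remark that the kernel keeps them around while the process still runs.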



RE: 2.4 and 2GB swap partition limit

2001-04-30 Thread Richard B. Johnson

On Mon, 30 Apr 2001, David S. Miller wrote:

 
> Richard B. Johnson writes:
> > On Mon, 30 Apr 2001, Torrey Hoffman wrote:
> > > In general, is there a safe way to replace executable files for
> > > programs that might be running while their on-disk images are
> > > replaced?
> > 
> > Yes. Perfectly safe:
> > 
> > mv /usr/bin/exeimage /usr/bin/exeimage.sav
> > cp /wherever/exeimage /usr/bin/exeimage
> > 
> > 
> > The executing task will continue to use the old image until it exits.
> 
> Even more effective is:
> 
> mv /wherever/exeimage /usr/bin/exeimage
> 
> The kernel keeps around the contents of the old file while
> the executing process still runs.
> 
> This is also basically how things like libc get installed.
> A single mv not only preserves currently referenced contents,
> it is atomic.
> 
> Later,
> David S. Miller
> [EMAIL PROTECTED]

Sure, but now you can't get back if the new software doesn't run.
This is why I recommended the two steps and cautioned about testing
the new stuff first.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot...; Actual explanation
obtained from the Micro$oft help desk.





Re: 2.4 and 2GB swap partition limit

2001-04-30 Thread Andreas Ferber

Hi,

On Mon, Apr 30, 2001 at 03:14:25PM -0400, Richard B. Johnson wrote:

> > mv /wherever/exeimage /usr/bin/exeimage
[...]
> > This is also basically how things like libc get installed.
> > A single mv not only preserves currently referenced contents,
> > it is atomic.

One restriction: /wherever and /usr/bin must be on the same partition,
otherwise the mv will be the same as rm /usr/bin/exeimage && cp
/wherever/exeimage /usr/bin/exeimage && rm /wherever/exeimage, which
is for sure not atomic.

> Sure, but now you can't get back if the new software doesn't run.
> This is why I recommended the two steps and cautioned about testing
> the new stuff first.

Then create a hardlink first, to keep a backup _and_ atomically
replace the file.

Andreas
-- 
Build a system that even a fool can use and only a fool will want to use it.
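
One way the hardlink suggestion could look in practice, as a sketch only. Everything lives in one scratch directory so that both the link and the rename stay on a single filesystem, which is the restriction noted above; the names are hypothetical.

```shell
set -eu

dir=$(mktemp -d)
printf 'old image\n' > "$dir/exeimage"
printf 'new image\n' > "$dir/exeimage.new"

# Give the old inode a second name before replacing the first one.
ln "$dir/exeimage" "$dir/exeimage.sav"

# Atomic replacement of the name; the old inode stays reachable via .sav.
mv -f "$dir/exeimage.new" "$dir/exeimage"

backup=$(cat "$dir/exeimage.sav")
current=$(cat "$dir/exeimage")

rm -rf "$dir"

echo "backup:  $backup"
echo "current: $current"
```

Rolling back is then another same-filesystem rename of the .sav name over the installed one.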




Re: 2.4 and 2GB swap partition limit

2001-04-29 Thread Ingo Oeser

On Fri, Apr 27, 2001 at 11:40:40PM +0100, Hugh Dickins wrote:
> > > An interesting option (though with less-than-stellar performance
> > > characteristics) would be a dynamically expanding swapfile.  If you're
> > > going to be hit with swap penalties, it may be useful to not have to
> > > pre-reserve something you only hit once in a great while.
> > This makes amazingly little sense since you'd still need to
> > pre-reserve the disk space the swapfile grows into.
> It makes roughly the same sense as over-committing memory.
> Both are useful, both are unreliable.

And we have the one, so we should also implement the other one to
be totally unreliable.

*gd*

Ingo Oeser
-- 
10.+11.03.2001 - 3. Chemnitzer LinuxTag 
  been there and had much fun   



Re: 2.4 and 2GB swap partition limit

2001-04-29 Thread Kenneth Johansson

Jonathan Lundell wrote:

>
> (Does Linux swap out text, by the way, he asks ignorantly?)

.text is just dropped and read back from the actual file; it's not put into the swap




Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Jonathan Lundell

At 2:04 PM -0400 2001-04-28, Albert D. Cahalan wrote:
>It is a disaster waiting to happen. Instead of having the offending
>process get killed, your machine could suffer extreme thrashing.
>
>Have enough swap for idle processes and no more.

Let's all together now say "working set".

(Does Linux swap out text, by the way, he asks ignorantly?)
-- 
/Jonathan Lundell.



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

David Lang wrote:
> at the low end using a bit of disk for swap doesn't hurt, I ran into a
> case a couple years ago on AIX systems. we buy them with 2G ram so that we
> don't need to swap, but discovered (the hard way) that we also needed to
> allocate 4G of disk space for those boxes (allocating less than 2G meant
> that we couldn't use the 2G of RAM). This meant that we had to go out and
> buy 2nd hard drives for every machine, just to put the swap files on.
> 
> now disks are larger today so it's not as much of an issue, but even with
> modern 9-18G drives you can end up eating up 20% or more on a large
> machine, this starts to be significant. you can try to say that any box
> with that much ram must have lots of disk as well, and most of the time
> you will be right, but not all the time. there are cases (webservers for
> example) where the machines will be built with lots of RAM and CPU and
> little disk because they get all their content and put all their logs
> elsewhere on the network. in fact with the advances in flash size and the
> desire to create high performance clusters, I would not be surprised to
> see web node machines produced with no hard drives. it means one less
> thing that can break on the system (think a rack of transmeta powered
> boxes with no moving parts in the rack except possibly fans)

On Sat, 28 Apr 2001 [EMAIL PROTECTED] wrote:

> > I've ALWAYS said that it's a rule-of-thumb. This means that if you
> > have a good argument to do it differently, you should surely do so!

Roger. 


-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

Albert D. Cahalan wrote:
> So that is a factor of 50 in price. It's what, a factor of 100
> in access time?

Actually it's only about 10. 

> > That disk space is just sitting there. Never to be used. I spent $400
> > on the RAM, and I'm now reserving about $8 worth of disk space for
> > swap. I think that the $8 is well worth it. It keeps my machine
> > functional a while longer should something go haywire... As I said:
> > If you don't want to see it that way: Fine with me. 
> 
> It is a disaster waiting to happen. Instead of having the offending
> process get killed, your machine could suffer extreme thrashing.
> 
> Have enough swap for idle processes and no more.

Right. Now there is some time where "extreme thrashing" will alert ME
(a human, I think) to try and find/kill the offending process.

Otherwise I have to trust Rik's OOM killer. Now his OOM killer isn't
all that bad. But it isn't human. Humans are better at actually
finding the real CAUSE. An OOM killer might hit one or two innocent
processes along the way. So far I've killed the right process ALL the
time. I can't say the same for the OOM killer.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread David Lang

at the low end using a bit of disk for swap doesn't hurt, I ran into a
case a couple years ago on AIX systems. we buy them with 2G ram so that we
don't need to swap, but discovered (the hard way) that we also needed to
allocate 4G of disk space for those boxes (allocating less than 2G meant
that we couldn't use the 2G of RAM). This meant that we had to go out and
buy 2nd hard drives for every machine, just to put the swap files on.

now disks are larger today so it's not as much of an issue, but even with
modern 9-18G drives you can end up eating up 20% or more on a large
machine, this starts to be significant. you can try to say that any box
with that much ram must have lots of disk as well, and most of the time
you will be right, but not all the time. there are cases (webservers for
example) where the machines will be built with lots of RAM and CPU and
little disk because they get all their content and put all their logs
elsewhere on the network. in fact with the advances in flash size and the
desire to create high performance clusters, I would not be surprised to
see web node machines produced with no hard drives. it means one less
thing that can break on the system (think a rack of transmeta powered
boxes with no moving parts in the rack except possibly fans)

David Lang

On Sat, 28 Apr 2001 [EMAIL PROTECTED] wrote:

> Date: Sat, 28 Apr 2001 16:11:47 +0200 (MEST)
> From: [EMAIL PROTECTED]
> To: Wakko Warner <[EMAIL PROTECTED]>
> Cc: Rogier Wolff <[EMAIL PROTECTED]>,
>  Xavier Bestel <[EMAIL PROTECTED]>,
>  Goswin Brederlow <[EMAIL PROTECTED]>,
>  William T Wilson <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
>  [EMAIL PROTECTED]
> Subject: Re: 2.4 and 2GB swap partition limit
>
> Wakko Warner wrote:
> > > So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
> > > swap space. Fine with me.
>
> > I put this much ram into the system to keep from having swap.  I
> > still say swap=2x ram is a stupid idea.  I fail to see the logic in
> > that.  Disk is much much slower than ram and if you're writing all
> > ram to disk that's also slow.
>
> > I have a machine with 256mb of ram and no disk.  It runs just fine
> > w/o swap. Only reason I even had swap here is if I ran something
> > that used up all my memory and it has happened.
>
> > Since when has linux started to be like windows "our way or no way"?
>
> I've ALWAYS said that it's a rule-of-thumb. This means that if you
> have a good argument to do it differently, you should surely do so!
>
> I maintain a 32M machine without swap. My workstation has 768Mb RAM
> and almost 2G swap:
>
> /home/wolff> free
>              total       used       free     shared    buffers     cached
> Mem:        770872     766724       4148          0     259816      78308
> -/+ buffers/cache:     428600     342272
> Swap:      1843212       1840    1841372
> /home/wolff>
>
> That disk space is just sitting there. Never to be used. I spent $400
> on the RAM, and I'm now reserving about $8 worth of disk space for
> swap. I think that the $8 is well worth it. It keeps my machine
> functional a while longer should something go haywire... As I said:
> If you don't want to see it that way: Fine with me.
>
>   Roger.
>
>
>
> --
> ** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
> *-- BitWizard writes Linux device drivers for any device you may have! --*
> * There are old pilots, and there are bold pilots.
> * There are also old, bald pilots.



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Albert D. Cahalan

Rogier Wolff writes:
> Wakko Warner wrote:

>>> So you've spent almost $200 for RAM, and refuse to spend
>>> $4 for 1Gb of swap space. Fine with me. 

So that is a factor of 50 in price. It's what, a factor of 100
in access time?

> That disk space is just sitting there. Never to be used. I spent $400
> on the RAM, and I'm now reserving about $8 worth of disk space for
> swap. I think that the $8 is well worth it. It keeps my machine
> functional a while longer should something go haywire... As I said:
> If you don't want to see it that way: Fine with me. 

It is a disaster waiting to happen. Instead of having the offending
process get killed, your machine could suffer extreme thrashing.

Have enough swap for idle processes and no more.



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread J . A . Magallon


On 04.28 Rogier Wolff wrote:
> 
> I've ALWAYS said that it's a rule-of-thumb. This means that if you
> have a good argument to do it differently, you should surely do so!
> 

I'm not so sure it's only a 'rule of thumb'. I do not know the state of
paging in the just-released 2.4.4, but in previous kernels a page that was
paged out keeps its place in swap even if it is paged in again, so
once you have paged out all your RAM at least once, you can't get any
more memory, even if swap is 'empty'.

Now that macs leave out that kind of swap (MacOS classic), linux takes it.
At least macos does not allow you to set vm to less than your physical mem.

-- 
J.A. Magallon  #  Let the source
mailto:[EMAIL PROTECTED]  #  be with you, Luke... 

Linux werewolf 2.4.4 #1 SMP Sat Apr 28 11:45:02 CEST 2001 i686




Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

Wakko Warner wrote:
> > So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
> > swap space. Fine with me. 
 
> I put this much ram into the system to keep from having swap.  I
> still say swap=2x ram is a stupid idea.  I fail to see the logic in
> that.  Disk is much much slower than ram and if you're writing all
> ram to disk that's also slow.
 
> I have a machine with 256mb of ram and no disk.  It runs just fine
> w/o swap. Only reason I even had swap here is if I ran something
> that used up all my memory and it has happened.

> Since when has linux started to be like windows "our way or no way"?

I've ALWAYS said that it's a rule-of-thumb. This means that if you
have a good argument to do it differently, you should surely do so!

I maintain a 32M machine without swap. My workstation has 768Mb RAM
and almost 2G swap:

/home/wolff> free
             total       used       free     shared    buffers     cached
Mem:        770872     766724       4148          0     259816      78308
-/+ buffers/cache:     428600     342272
Swap:      1843212       1840    1841372
/home/wolff>

That disk space is just sitting there. Never to be used. I spent $400
on the RAM, and I'm now reserving about $8 worth of disk space for
swap. I think that the $8 is well worth it. It keeps my machine
functional a while longer should something go haywire... As I said:
If you don't want to see it that way: Fine with me. 
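
For readers unfamiliar with this older free layout, the "-/+ buffers/cache" row is derived from the "Mem:" row. The sketch below redoes that arithmetic with the figures from the output above (all in kilobytes); the variable names are only illustrative.

```shell
set -eu

# Figures copied from the free output above (kB).
total=770872
used=766724
buffers=259816
cached=78308
swap_total=1843212
swap_used=1840

# Memory held by programs once buffers and page cache are discounted.
app_used=$((used - buffers - cached))
# What would be available if buffers and cache were given back.
app_free=$((total - app_used))
swap_free=$((swap_total - swap_used))

echo "used minus buffers/cache: $app_used"   # 428600
echo "free plus buffers/cache:  $app_free"   # 342272
echo "swap free:                $swap_free"  # 1841372
```

So roughly 1.8 MB of the almost 2G of swap is in use, which is the point being made about the space mostly sitting idle.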

Roger. 



-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Wakko Warner

> So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
> swap space. Fine with me. 

I put this much ram into the system to keep from having swap.  I still say
swap=2x ram is a stupid idea.  I fail to see the logic in that.  Disk is
much much slower than ram and if you're writing all ram to disk that's also
slow.

I have a machine with 256mb of ram and no disk.  It runs just fine w/o swap. 
Only reason I even had swap here is if I ran something that used up all my
memory and it has happened.

Since when has linux started to be like windows "our way or no way"?

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

Wakko Warner wrote:
> > I've always been trying to convince people that 2x RAM remains a good 
> > rule-of-thumb.
> 
> IMO this is pointless
> 
>              total       used       free     shared    buffers     cached
> Mem:        517456     505332      12124     111016      97752     236884
> -/+ buffers/cache:     170696     346760
> Swap:       131048      23216     107832
> 
> Of course for me, I'm not about to waste 1gb of disk space for swap.
> 
> The swap I have is 2 partitions, one on each drive both with a priority of
> 0.  Personally, I like the way it's done on my box.

So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
swap space. Fine with me. 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

LA Walsh wrote:
> Rogier Wolff wrote:
> 
> > > > On Linux any swap adds to the memory pool, so 1xRAM would be
> > > > equivalent to 2xRAM with the old old OS's.
> > >
> > > no more true AFAIK
> >
> > I've always been trying to convince people that 2x RAM remains a good
> > rule-of-thumb.
> 
> ---
> Ug.  I like to view swap as "low grade memory" -- i.e. I really
> should spend 99.9% of my time in RAM -- if I spill, then it means
> I'm running too much/too big for my computer and should get more RAM --
> meanwhile, I suffer with performance degradation to remind me I'm really
> exceeding my machine's physical memory capacity.

Agreed. However, with current growing memory sizes, people are
suggesting: "I ran with 32Mb RAM and 64Mb swap until a year ago, so
128Mb ram should allow me to run swapless". I disagree.

The price-difference between RAM and disk is such (*) that if you
follow the guideline of swap=2xRAM, you're still spending 20 to 50
times as much on the RAM as you are on the disk space for swap.

Even if the swap is going to be idle 99.9% of the time, the investment
allows you to say "gosh, the machine is slow today. It might be
swapping" instead of "Why the heck has the machine crashed (#) all of
a sudden."

Rogier. 

(*) And remains like that!

(#) Even if we have a good OOM killer, you might find the machine in a
non-workable state.

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

LA Walsh wrote:
 Rogier Wolff wrote:
 
On Linux any swap adds to the memory pool, so 1xRAM would be
equivalent to 2xRAM with the old old OS's.
  
   no more true AFAIK
 
  I've always been trying to convice people that 2x RAM remains a good
  rule-of-thumb.
 
 ---
 Ug.  I like to view swap as low grade memory -- i.e. I really
 should spend 99.9% of my time in RAM -- if I spill, then it means
 I'm running too much/too big for my computer and should get more RAM --
 meanwhile, I suffer with performance degradation to remind me I'm really
 exceeding my machine's physical memory capacity.

Agreed. However, with current growing memory sizes, people are
suggesting: I ran with 32Mb RAM and 64Mb swap until a year ago, so
128Mb ram should allow me to run swapless. I disagree.

The price-difference between RAM and disk is such (*) that if you
follow the guideline of swap=2xRAM, you're still spending 20 to 50
times as much on the RAM as you are on the disk space for swap.

Even if the swap is going to be idle 99.9% of the time, the investment
allows you to say gosh what is the machien slow today. It might be
swapping instead of Why the heck has the machine crashed (#) all of
a sudden.

Rogier. 

(*) And remains like that!

(#) Even if we have a good OOM killer, you might find the machine in a
non-workable state.

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Wakko Warner

 So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
 swap space. Fine with me. 

I put this much ram into the system to keep from having swap.  I still say
swap=2x ram is a stupid idea.  I fail to see the logic in that.  Disk is
much much slower than ram and if you're writing all ram to disk that's also
slow.

I have a machine with 256mb of ram and no disk.  It runs just fine w/o swap. 
Only reason I even had swap here is if I ran something that used up all my
memory and it has happened.

Since when has linux started to be like windows our way or no way?

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread J . A . Magallon


On 04.28 Rogier Wolff wrote:
 
 I've ALWAYS said that it's a rule-of-thumb. This means that if you
 have a good argument to do it differently, you should surely do so!
 

I'm not so sure it's only a 'rule of thumb'. Do not know the state of
paging in just released 2.4.4, but in previuos kernel, a page that was
paged-out, reserves its place in swap even if it is paged-in again, so
once you have paged-out all your ram at least once, you can't get any
more memory, even if swap is 'empty'.

Now that macs leave out that kind of swap (MacOS classic), linux takes it.
At least macos does not allow you to set vm to less than your physical mem.

-- 
J.A. Magallon  #  Let the source
mailto:[EMAIL PROTECTED]  #  be with you, Luke... 

Linux werewolf 2.4.4 #1 SMP Sat Apr 28 11:45:02 CEST 2001 i686

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Albert D. Cahalan

Rogier Wolff writes:
 Wakko Warner wrote:

 So you've spent almost $200 for RAM, and refuse to spend
 $4 for 1Gb of swap space. Fine with me. 

So that is a factor of 50 in price. It's what, a factor of 100
in access time?

 That disk space is just sitting there. Never to be used. I spent $400
 on the RAM, and I'm now reserving about $8 worth of disk space for
 swap. I think that the $8 is well worth it. It keeps my machine
 functional a while longer should something go haywire... As I said:
 If you don't want to see it that way: Fine with me. 

It is a disaster waiting to happen. Instead of having the offending
process get killed, your machine could suffer extreme thrashing.

Have enough swap for idle processes and no more.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread David Lang

at the low end useing a bit of disk for swap doesn't hurt, I ran into a
case a couple years ago on AIX systems. we buy them with 2G ram so that we
don't need to swap, but discovered (the hard way) that we also needed to
allocate 4G of disk space for those boxes (allocating less then 2G meant
that we couldn't use the 2G of RAM). This meant that we had to go out and
buy 2nd hard drives for every machine, just to put the swap files on.

now disks are larger today so it's not as much of an issue, but even with
modern 9-18G drives you can end up eating up 20% or more on a large
machine, this starts to be significant. you can try to say that any box
with that much ram must have lots of disk as well, and most of the time
you will be right, but not all the time. there are cases (webservers for
example) where the machines will be built with lots of RAM and CPU and
little disk becouse they get all their content and put all their logs
elsewhere on the network. in fact with the advances in flash size and the
desire to create high performance clusters, I would not be surprised to
see web node machines produced with no hard drives. it means one less
thing that can break on the system (think a rack of transmeta powered
boxes with no moving parts in the rack except possibly fans)

David Lang

On Sat, 28 Apr 2001 [EMAIL PROTECTED] wrote:

> Date: Sat, 28 Apr 2001 16:11:47 +0200 (MEST)
> From: [EMAIL PROTECTED]
> To: Wakko Warner [EMAIL PROTECTED]
> Cc: Rogier Wolff [EMAIL PROTECTED],
>     Xavier Bestel [EMAIL PROTECTED],
>     Goswin Brederlow [EMAIL PROTECTED],
>     William T Wilson [EMAIL PROTECTED], [EMAIL PROTECTED],
>     [EMAIL PROTECTED]
> Subject: Re: 2.4 and 2GB swap partition limit
>
> Wakko Warner wrote:
> > > So you've spent almost $200 for RAM, and refuse to spend $4 for 1Gb of
> > > swap space. Fine with me.
> >
> > I put this much ram into the system to keep from having swap.  I
> > still say swap=2x ram is a stupid idea.  I fail to see the logic in
> > that.  Disk is much much slower than ram and if you're writing all
> > ram to disk that's also slow.
> >
> > I have a machine with 256mb of ram and no disk.  It runs just fine
> > w/o swap. Only reason I even had swap here is if I ran something
> > that used up all my memory and it has happened.
> >
> > Since when has linux started to be like windows our way or no way?
>
> I've ALWAYS said that it's a rule-of-thumb. This means that if you
> have a good argument to do it differently, you should surely do so!
>
> I maintain a 32M machine without swap. My workstation has 768Mb RAM
> and almost 2G swap:
>
> /home/wolff> free
>              total       used       free     shared    buffers     cached
> Mem:        770872     766724       4148          0     259816      78308
> -/+ buffers/cache:     428600     342272
> Swap:      1843212       1840    1841372
> /home/wolff>
>
> That disk space is just sitting there. Never to be used. I spent $400
> on the RAM, and I'm now reserving about $8 worth of disk space for
> swap. I think that the $8 is well worth it. It keeps my machine
> functional a while longer should something go haywire... As I said:
> If you don't want to see it that way: Fine with me.
>
>                       Roger.






Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Rogier Wolff

David Lang wrote:
> at the low end using a bit of disk for swap doesn't hurt, I ran into a
> case a couple years ago on AIX systems. we buy them with 2G ram so that we
> don't need to swap, but discovered (the hard way) that we also needed to
> allocate 4G of disk space for those boxes (allocating less than 2G meant
> that we couldn't use the 2G of RAM). This meant that we had to go out and
> buy 2nd hard drives for every machine, just to put the swap files on.
>
> now disks are larger today so it's not as much of an issue, but even with
> modern 9-18G drives you can end up eating up 20% or more on a large
> machine, this starts to be significant. you can try to say that any box
> with that much ram must have lots of disk as well, and most of the time
> you will be right, but not all the time. there are cases (webservers for
> example) where the machines will be built with lots of RAM and CPU and
> little disk because they get all their content and put all their logs
> elsewhere on the network. in fact with the advances in flash size and the
> desire to create high performance clusters, I would not be surprised to
> see web node machines produced with no hard drives. it means one less
> thing that can break on the system (think a rack of transmeta powered
> boxes with no moving parts in the rack except possibly fans)
>
> On Sat, 28 Apr 2001 [EMAIL PROTECTED] wrote:
>
> > I've ALWAYS said that it's a rule-of-thumb. This means that if you
> > have a good argument to do it differently, you should surely do so!

Roger. 


-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-28 Thread Jonathan Lundell

At 2:04 PM -0400 2001-04-28, Albert D. Cahalan wrote:
> It is a disaster waiting to happen. Instead of having the offending
> process get killed, your machine could suffer extreme thrashing.
>
> Have enough swap for idle processes and no more.

Let's all together now say "working set".

(Does Linux swap out text, by the way, he asks ignorantly?)
-- 
/Jonathan Lundell.



Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread LA Walsh

Rik van Riel wrote:

> On Fri, 27 Apr 2001, LA Walsh wrote:
>
> > An interesting option (though with less-than-stellar performance
> > characteristics) would be a dynamically expanding swapfile.  If you're
> > going to be hit with swap penalties, it may be useful to not have to
> > pre-reserve something you only hit once in a great while.
>
> This makes amazingly little sense since you'd still need to
> pre-reserve the disk space the swapfile grows into.

---
Why?  Why not have a zero length file that you grow only if you spill?
If you can't spill, you are out of memory -- or reserve a 'safety'
margin ahead -- like reserve 32k at a time and grow it.  It may make
little sense, but I believe it is what is used on pseudo OS's
like Windows -- you *can* preallocate, but the normal case has
Windows managing the swap file and growing it as needed up to
available disk space.  If it is doable in windows, you'd think there'd
be some way of doing it in Linux, but perhaps linux's complexity
doesn't allow for that type of feature.

As for disk-space reserves, if you have 5% reserved for
'root' on a 20G ext disk, that still amounts to 1G reserved for root.
Seems an automatically sizing swap file might be just fine for some people
(not me, I don't even like to use swap, but I'm not my mom using windows ME either).

But, conversely, if it's coming out of space I wouldn't normally
use anyway -- say the "5%" -- i.e. the 5% is something I'd likely only
use under *rare* conditions.  I might have enough memory and the
right system load that I also 'rarely' use swap -- so not reserving
1G/1G (2xMEM) on my laptop both of which will rarely get used seems like
a waste of 2G.  I suppose if I put it that way I might convince myself
to use it.

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338






Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Hugh Dickins

On Fri, 27 Apr 2001, Rik van Riel wrote:
> On Fri, 27 Apr 2001, LA Walsh wrote:
> 
> > An interesting option (though with less-than-stellar performance
> > characteristics) would be a dynamically expanding swapfile.  If you're
> > going to be hit with swap penalties, it may be useful to not have to
> > pre-reserve something you only hit once in a great while.
> 
> This makes amazingly little sense since you'd still need to
> pre-reserve the disk space the swapfile grows into.

It makes roughly the same sense as over-committing memory.
Both are useful, both are unreliable.

Hugh




Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Rik van Riel

On Fri, 27 Apr 2001, Hugh Dickins wrote:
> On Fri, 27 Apr 2001, Rik van Riel wrote:
> > On Fri, 27 Apr 2001, LA Walsh wrote:
> >
> > > An interesting option (though with less-than-stellar performance
> > > characteristics) would be a dynamically expanding swapfile.  If you're
> > > going to be hit with swap penalties, it may be useful to not have to
> > > pre-reserve something you only hit once in a great while.
> >
> > This makes amazingly little sense since you'd still need to
> > pre-reserve the disk space the swapfile grows into.
>
> It makes roughly the same sense as over-committing memory.
> Both are useful, both are unreliable.

True, agreed.

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/




Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Wakko Warner

> I've always been trying to convince people that 2x RAM remains a good
> rule-of-thumb.

IMO this is pointless

             total       used       free     shared    buffers     cached
Mem:        517456     505332      12124     111016      97752     236884
-/+ buffers/cache:     170696     346760
Swap:       131048      23216     107832

Of course for me, I'm not about to waste 1gb of disk space for swap.

The swap I have is 2 partitions, one on each drive both with a priority of
0.  Personally, I like the way it's done on my box.

-- 
 Lab tests show that use of micro$oft causes cancer in lab animals



Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Thomas Dodd

Rik van Riel wrote:
> 
> On Fri, 27 Apr 2001, LA Walsh wrote:
> 
> > An interesting option (though with less-than-stellar performance
> > characteristics) would be a dynamically expanding swapfile.  If you're
> > going to be hit with swap penalties, it may be useful to not have to
> > pre-reserve something you only hit once in a great while.
> 
> This makes amazingly little sense since you'd still need to
> pre-reserve the disk space the swapfile grows into.
> 
> A dynamically growing swap file can only save you if you
> reserve enough free space on your filesystem for the thing
> to grow...

It seems to work for M$, not that they are a good example.

But consider having min_swap, max_swap, min_free_space and the filename
to use in /proc/sys/vm or /proc/sys/swapfile. These could
then be set by init scripts, like other sysctls.

It never grows larger than max(swapfile.max_swap, free_space -
min_free_space).
so if you have free space on the filesystem it can be used,
but if you don't have space the current behavior remains.

Sure it would be slow, but that would only be a problem if you
run out of swap space and need to allocate more. Any
time this routine allocates a larger file than swapfile.min_swap
or frees space you send a WARN message.

Now the user will know why the performance dropped
and can either add RAM, or increase swap by adding a
partition or file, or increase swapfile.min_swap.

Those with enough RAM/swap will never even know it's there.
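
A minimal sketch of the growth bound described above, under stated
assumptions. The struct and function names are hypothetical; only the
three knob names (min_swap, max_swap, min_free_space) come from the
message. The message writes the bound as max(...); the sketch assumes the
intended cap is the smaller of max_swap and the space left after the
reserve, and that the file never drops below min_swap.

```c
#include <stdint.h>

/* Hypothetical tunables mirroring the knobs named in the message.
 * All values share one unit (for example megabytes). */
struct swapfile_tunables {
    uint64_t min_swap;        /* preallocated size, never shrunk below */
    uint64_t max_swap;        /* hard ceiling on the swap file */
    uint64_t min_free_space;  /* filesystem space that must stay free */
};

/* Largest size the swap file may reach, given the space the filesystem
 * could hand to it (free space plus what the file already occupies). */
static uint64_t swapfile_size_limit(const struct swapfile_tunables *t,
                                    uint64_t available)
{
    /* Space left once the reserve is set aside; saturate at zero. */
    uint64_t room = available > t->min_free_space
                  ? available - t->min_free_space : 0;
    uint64_t limit = room < t->max_swap ? room : t->max_swap;

    /* With no room to grow, behaviour falls back to the fixed min_swap. */
    return limit > t->min_swap ? limit : t->min_swap;
}

/* The message asks for a warning whenever the file grows past min_swap
 * or gives space back. Returns 1 when a warning is due, else 0. */
static int swapfile_should_warn(const struct swapfile_tunables *t,
                                uint64_t old_size, uint64_t new_size)
{
    return (new_size > old_size && new_size > t->min_swap)
        || new_size < old_size;
}
```

With min_swap = 128, max_swap = 1024 and min_free_space = 512, a
filesystem offering 700 units would cap the file at 188, and one offering
only 300 would leave it at the fixed 128.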

-Thomas



Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Rik van Riel

On Fri, 27 Apr 2001, LA Walsh wrote:

> An interesting option (though with less-than-stellar performance
> characteristics) would be a dynamically expanding swapfile.  If you're
> going to be hit with swap penalties, it may be useful to not have to
> pre-reserve something you only hit once in a great while.

This makes amazingly little sense since you'd still need to
pre-reserve the disk space the swapfile grows into.

A dynamically growing swap file can only save you if you
reserve enough free space on your filesystem for the thing
to grow...

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/




Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread LA Walsh

Rogier Wolff wrote:

> > > On Linux any swap adds to the memory pool, so 1xRAM would be
> > > equivalent to 2xRAM with the old old OS's.
> >
> > no longer true AFAIK
>
> I've always been trying to convince people that 2x RAM remains a good
> rule-of-thumb.

---
Ug.  I like to view swap as "low grade memory" -- i.e. I really
should spend 99.9% of my time in RAM -- if I spill, then it means
I'm running too much/too big for my computer and should get more RAM --
meanwhile, I suffer with performance degradation to remind me I'm really
exceeding my machine's physical memory capacity.

An interesting option (though with less-than-stellar performance
characteristics) would be a dynamically expanding swapfile.  If you're
going to be hit with swap penalties, it may be useful to not have to
pre-reserve something you only hit once in a great while.

Definitely only for systems where you don't expect to use swap (but
it could be there for "emergencies" up to some predefined limit or
available disk space).

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338






Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Rogier Wolff

Xavier Bestel wrote:
> On 08 Mar 2001 14:05:25 +0100, Goswin Brederlow wrote:
> 
> > I believe the 2xRAM rule comes from the OS's where ram was only buffer
> > for the swap. So with 1xRAM you had a running system with 1xRAM
> > memory, so nothing is gained by that much swap.
> 
> I think kernels 2.4.x came back to this behavior.
>  
> > On Linux any swap adds to the memory pool, so 1xRAM would be
> > equivalent to 2xRAM with the old old OS's.
> 
> no longer true AFAIK

I've always been trying to convince people that 2x RAM remains a good
rule-of-thumb.

Rogier. 



-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 



Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Xavier Bestel

On 08 Mar 2001 14:05:25 +0100, Goswin Brederlow wrote:

> I believe the 2xRAM rule comes from the OS's where ram was only buffer
> for the swap. So with 1xRAM you had a running system with 1xRAM
> memory, so nothing is gained by that much swap.

I think kernels 2.4.x came back to this behavior.
 
> On Linux any swap adds to the memory pool, so 1xRAM would be
> equivalent to 2xRAM with the old old OS's.

no longer true AFAIK

Xav




Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread Goswin Brederlow

> " " == Rogier Wolff <[EMAIL PROTECTED]> writes:

 > William T Wilson wrote:
>> On Fri, 2 Mar 2001 [EMAIL PROTECTED] wrote:
>> 
>> > Linus has spoken, and 2.4.x now requires swap = 2x RAM.
>> 
>> I think I missed this.  What possible value does this have?
>> (Not even Sun, the original purveyors of the 2x RAM rule, need
>> this any more).

 > RAM is still about 100x more expensive than HD. So I always
 > recommend you use about 2% of the money you spent on RAM to pay
 > for the HD space to handle swap.

 > Actually the deal is: either use enough swap (about 2x RAM) or
 > use none at all.

I believe the 2xRAM rule comes from the OS's where ram was only buffer
for the swap. So with 1xRAM you had a running system with 1xRAM
memory, so nothing is gained by that much swap.

On Linux any swap adds to the memory pool, so 1xRAM would be
equivalent to 2xRAM with the old old OS's.

Generally I would say that you want only as much swap as lets your mouse
keep moving when it's used. That usually means as much swap as you can
read/write within 10-30 seconds when done continuously. If you need more
swap the system becomes unusable and performance really goes down.

But that's all just my liking. Some applications can use 10xRAM swap
and still not stop the mouse from working, some applications do that
with 100 MB swap. I would keep swap moderate (one 2GB partition should
be enough for any normal system) and use swapfiles in case more is
needed temporarily.
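
The 10-30 second rule above can be worked through as a small sketch. The
helper names are hypothetical, and the throughput figure used in the note
below (20 MB/s sustained) is an assumption for illustration, not a number
from this thread.

```c
#include <stdint.h>

/* Swap sizes, in MB, at the two ends of the 10-30 second window. */
struct swap_range_mb {
    uint64_t low_mb;
    uint64_t high_mb;
};

/* Swap a disk can stream in 10 to 30 seconds at the given sustained
 * rate (MB per second). */
static struct swap_range_mb swap_for_throughput(uint64_t mb_per_second)
{
    struct swap_range_mb r = { mb_per_second * 10, mb_per_second * 30 };
    return r;
}

/* Seconds needed to read or write swap_mb of swap once, end to end,
 * rounded up. mb_per_second must be non-zero. */
static uint64_t seconds_to_stream(uint64_t swap_mb, uint64_t mb_per_second)
{
    return (swap_mb + mb_per_second - 1) / mb_per_second;
}
```

At the assumed 20 MB/s the window corresponds to 200-600 MB of swap, while
streaming a full 2 GB (2048 MB) partition once would take about 103
seconds, which is why heavy use of that much swap makes a machine feel
stuck.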

Regards,
Goswin



Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Andries . Brouwer

>> For 2.5 we could perhaps think about a new swapfile layout

> The format seems to be just fine.

No, the present definition is terrible.

Read the mkswap source. A forest of #ifdefs,
and still sometimes user assistance is required
because mkswap cannot always figure out what the "pagesize" is.

There are two main problems:
(i) "new" swap is hardly larger than "old" swap
(ii) the unit in which new swap is measured is a mystery

So, the next swap space has (i) a signature "SWAPSPACE3",
(ii) (not strictly necessary) a size given as a 64-bit number in bytes.
Moreover, the swapon call must not refuse swapspaces
that are larger than the kernel can handle.
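
A rough sketch of what such a header could look like, under stated
assumptions. Only the "SWAPSPACE3" signature and the 64-bit size in bytes
come from the proposal; the struct layout, the little-endian encoding and
every function name here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SWAP3_MAGIC     "SWAPSPACE3"  /* signature proposed in the message */
#define SWAP3_MAGIC_LEN 10

/* Hypothetical on-disk header: field order and endianness are assumed. */
struct swap3_header {
    unsigned char magic[SWAP3_MAGIC_LEN];
    unsigned char size_le[8];   /* swap size in bytes, little-endian */
};

/* Fill in the signature and the size; what mkswap would do. */
static void swap3_write(struct swap3_header *h, uint64_t size_bytes)
{
    memcpy(h->magic, SWAP3_MAGIC, SWAP3_MAGIC_LEN);
    for (int i = 0; i < 8; i++)
        h->size_le[i] = (unsigned char)(size_bytes >> (8 * i));
}

/* Return the size recorded in the header, or 0 if the signature is wrong. */
static uint64_t swap3_size(const struct swap3_header *h)
{
    uint64_t size = 0;

    if (memcmp(h->magic, SWAP3_MAGIC, SWAP3_MAGIC_LEN) != 0)
        return 0;
    for (int i = 7; i >= 0; i--)
        size = (size << 8) | h->size_le[i];
    return size;
}

/* swapon-side policy: accept an oversized area and use only what the
 * kernel can address, instead of refusing it. */
static uint64_t swap3_usable(const struct swap3_header *h, uint64_t kernel_max)
{
    uint64_t size = swap3_size(h);
    return size < kernel_max ? size : kernel_max;
}
```

Storing the size in bytes means the reader never has to guess a page
size, and clamping in swap3_usable() is one way to read "must not
refuse": a 5 GB area on a kernel that can address 2 GB would simply be
used up to 2 GB.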

Andries



Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Rik van Riel

On 5 Mar 2001, Christoph Rohland wrote:

> BTW often these big servers run databases and application servers
> which have most of their memory in shared memory. Shared memory does
> free the swap entries on swapin. (I thought about changing that but as
> long as we have no garbage collection for idle swap entries I will not
> do it)

It's on my infinite TODO list ;)

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Rik van Riel

On 5 Mar 2001, Christoph Rohland wrote:

> > We've also seen (anecdotal evidence here) cases where a kernel
> > panics, which we believe may have to do with having 0 < swap < 2x
> > RAM.  We're investigating further.
> 
> That would be a kernel bug which should be fixed. The kernel should
> handle oom/oos.

There was a bug which made the OOM-killer not work for some
workloads. It should be fixed in the latest -ac kernels...

If in doubt, please test ;)

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Matti Aarnio

On Mon, Mar 05, 2001 at 09:58:00AM +0100, Christoph Rohland wrote:
> Hi Matt,
> 
> On Sun, 4 Mar 2001, Matt Domsch wrote:
> > My concern is that if there continues to be a 2GB swap
> > partition/file size limitation, and you can have (as currently
> > #defined) 8 swap partitions, you're limited to 16GB swap, which then
> > follows a max of 8GB RAM.  We'd like to sell servers with 32GB or
> > 64GB RAM to customers who request such for their applications.  Such
> > customers generally have no problem purchasing additional disks to
> > be used for swap, likely on a hardware RAID controller.
> 
> I did think about that too and I also think the 2GB limit is not
> appropriate for the big servers. But I do not believe that you need so
> much swap on these machines. If you drive a 32 GB machine so heavily
> into swap it is more busy finding the pages to swap than doing
> anything really interesting. (At least that's my experience)

2.4 did grow the maximum file size to CACHEPAGESIZE*2G bytes by
scaling the file page offsets held in the pagecache by
CACHEPAGESIZE, which may or may not be the same as the machine
PAGE_SIZE.  It can be larger, though probably not smaller.

For swap there is a similar 32-bit quantity (actually it is an
"unsigned long"); however, some bits in the swp_entry_t are used
for something useful in the cache logic besides the offset inside
the file.

Indeed, looking at the source, include/asm-i386/pgtable.h
defines the following:

#define SWP_TYPE(x) (((x).val >> 1) & 0x3f)
#define SWP_OFFSET(x)   ((x).val >> 8)
#define SWP_ENTRY(type, offset) ((swp_entry_t){ ((type) << 1)|((offset) << 8) })

The i386 actually supports up to 4*16 = 64 swap files (or partitions)
with this SWP_TYPE() definition, while include/linux/swap.h
defines MAX_SWAPFILES to be 8 ...  If that were a pointer array
to kmalloc()ed blocks, the limit could be much higher.  Indeed,
I think this is the only *static* limit anywhere in the current
swap code.
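
A minimal sketch of how the macros quoted above pack and unpack an
entry.  It assumes swp_entry_t is a bare struct around an unsigned
long; the helper names (swp_make, swp_type_of, swp_offset_of,
swp_max_types) are made up for illustration and are not kernel
functions.

```c
#include <assert.h>

/* Stand-in for the kernel type: assumed to wrap one unsigned long. */
typedef struct { unsigned long val; } swp_entry_t;

/* The i386 macros from the message, copied unchanged. */
#define SWP_TYPE(x) (((x).val >> 1) & 0x3f)
#define SWP_OFFSET(x)   ((x).val >> 8)
#define SWP_ENTRY(type, offset) ((swp_entry_t){ ((type) << 1)|((offset) << 8) })

/* Hypothetical wrapper: build an entry from a swap-area index and a
 * page offset.  The type must fit the 6-bit field.  Where unsigned
 * long is 32 bits wide, only 24 bits remain for the offset. */
swp_entry_t swp_make(unsigned long type, unsigned long offset)
{
    assert(type <= 0x3f);
    return SWP_ENTRY(type, offset);
}

/* Hypothetical wrappers around the decoding macros. */
unsigned long swp_type_of(swp_entry_t e)   { return SWP_TYPE(e); }
unsigned long swp_offset_of(swp_entry_t e) { return SWP_OFFSET(e); }

/* Number of distinct swap areas a 6-bit type field can name. */
unsigned long swp_max_types(void) { return 0x3fUL + 1; }
```

For example, swp_make(5, 0x123456) gives an entry whose type decodes
back to 5 and whose offset decodes back to 0x123456, and
swp_max_types() gives the 64 mentioned above.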

Similarly, i386 supports 2^24 pages of swap per file/partition.
( 16 million pages of 4k each = 64 GB -- should be enough ;) )
( That would require vmalloc() to allocate a 32 MB block, though.
  That might not be possible on every occasion, so swapon may fail. )
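
The arithmetic behind those figures can be sketched as follows.  The
function names are hypothetical, and the 2 bytes per map slot is an
assumption (one 16-bit use count per swap page) chosen because it
reproduces the 32 MB figure; check the swap code before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable in one swap area:
 * 2^offset_bits pages of 2^page_shift bytes each. */
uint64_t swap_area_bytes(unsigned offset_bits, unsigned page_shift)
{
    assert(offset_bits + page_shift < 64);
    return (uint64_t)1 << (offset_bits + page_shift);
}

/* Size of a per-area usage map with one fixed-size slot per page.
 * bytes_per_slot is an assumption supplied by the caller. */
uint64_t swap_map_bytes(unsigned offset_bits, unsigned bytes_per_slot)
{
    assert(offset_bits < 56);
    return ((uint64_t)1 << offset_bits) * bytes_per_slot;
}
```

With 24 offset bits and 4k pages (page shift 12), swap_area_bytes()
gives 2^36 bytes = 64 GB, and swap_map_bytes(24, 2) gives 2^25 bytes
= 32 MB.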

The more I read the documentation (= source and its comments),
the more I am inclined to think that the beast *will* work with
swap-partitions (and files!) larger than 2G.

Stephen Tweedie did this 'SWAPSPACE2' work for the 2.4 series; what
might he tell us?  Is it really just a matter of fixing the mkswap
utility?  Was Stephen just being conservative in saying:
"Don't go over 2G" (I haven't tested it)


Looking through the per-architecture definitions of these SWP_***()
macros, the shifts used for SWP_OFFSET seem to vary between 7 and
12, and for Alpha and MIPS64 the shift is 40.  Indeed, things are
not very easy to understand on 64-bit architectures.  It looks like
those architectures use the low 32 bits of swp_entry_t for
something, while most use at most a couple of bits.

Oh, even those 64-bit systems seem to give at least 24 bits for
the PAGE_SIZE 'offset'.  The lowest bit count for 'offset' seems to
be on s390, which gives "only" 2^20 * 4k pages, or 4 GB per
swap file/partition.  (SWP_OFFSET() shifts by 12, which is the
same as PAGE_SHIFT for the machine.  Why SPARC64 uses PAGE_SHIFT
in its own unique way, I don't know.)

Somehow I suspect that the makers of each architecture port have
not quite understood what the swp_entry_t bits are used for, and
have blindly presumed them to be related to PAGE_SIZE ...

> For 2.5 we could perhaps think about a new swapfile layout which
> allows bigger partitions.

The format seems to be just fine.

> Greetings
>   Christoph

/Matti Aarnio



Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Christoph Hellwig

Hi Matt,

In article <[EMAIL PROTECTED]> you wrote:
> My concern is that if there continues to be a 2GB swap partition/file size
> limitation, and you can have (as currently #defined) 8 swap partitions,
> you're limited to 16GB swap, which then follows a max of 8GB RAM.  We'd like
> to sell servers with 32GB or 64GB RAM to customers who request such for
> their applications.  Such customers generally have no problem purchasing
> additional disks to be used for swap, likely on a hardware RAID controller.

Do you actually want to page that high memory?  These high memory
configurations are usually used by databases and other huge applications
that have their own memory management.

Other UNIX versions, e.g. UnixWare with dshm, have implemented special
memory pools (in this case dshm) to give unswappable memory to these
applications.  While I don't like such implementations with fixed memory
pools, it might be a good idea to have a MAP_DEDICATED flag to mmap
(/dev/zero) to allocate non-aged and thus non-paged memory.

To keep the system from becoming unusable when too much of this memory
is allocated, allocation might need a special capability and/or a
dynamic limit on allocatable unaged memory (sysctl?).
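
MAP_DEDICATED is only a proposal here and does not exist as an mmap
flag.  As a rough illustration of the same goal with calls that do
exist, the sketch below maps /dev/zero and then pins the pages with
mlock(2), which is a different technique from the proposed flag.  The
function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of zero-filled private memory from /dev/zero and try
 * to pin it with mlock(2).  Returns NULL if the mapping fails.
 * *locked is set to 1 when the pages were pinned, or 0 when mlock was
 * refused (for example by RLIMIT_MEMLOCK); in that case the memory is
 * still usable but remains pageable. */
void *alloc_pinned(size_t len, int *locked)
{
    assert(len > 0 && locked != NULL);
    *locked = 0;

    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;

    if (mlock(p, len) == 0)
        *locked = 1;
    return p;
}

/* Release a region obtained from alloc_pinned(); unmapping the range
 * also removes any lock on it. */
void free_pinned(void *p, size_t len)
{
    munmap(p, len);
}
```

The existing resource limit on locked memory plays roughly the role
of the "dynamic limit" suggested above, which is why the sketch
reports a refused lock instead of treating it as fatal.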

Christoph

-- 
Of course it doesn't work. We've performed a software upgrade.



Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Christoph Rohland

Hi Matt,

On Sun, 4 Mar 2001, Matt Domsch wrote:
> My concern is that if there continues to be a 2GB swap
> partition/file size limitation, and you can have (as currently
> #defined) 8 swap partitions, you're limited to 16GB swap, which then
> follows a max of 8GB RAM.  We'd like to sell servers with 32GB or
> 64GB RAM to customers who request such for their applications.  Such
> customers generally have no problem purchasing additional disks to
> be used for swap, likely on a hardware RAID controller.

I did think about that too and I also think the 2GB limit is not
appropriate for the big servers. But I do not believe that you need so
much swap on these machines. If you drive a 32 GB machine so heavily
into swap it is more busy finding the pages to swap than doing
anything really interesting. (At least that's my experience)

BTW often these big servers run databases and application servers
which have most of their memory in shared memory. Shared memory does
free the swap entries on swapin. (I thought about changing that but as
long as we have no garbage collection for idle swap entries I will not
do it)

On any loaded server you have to check the swap space requirements
regularly and adjust to your needs. But setting up more than, let's
say, 8GB of swap is a waste of resources IMHO.

> We've also seen (anecdotal evidence here) cases where a kernel
> panics, which we believe may have to do with having 0 < swap < 2x
> RAM.  We're investigating further.

That would be a kernel bug which should be fixed. The kernel should
handle oom/oos.

>> Actually the deal is: either use enough swap (about 2x RAM) or use
>> none at all. 
> 
> If swap space isn't required in all cases, great!  We'll encourage
> the use of swap files as needed, rather than swap partitions.  But,
> if instead you *require* swap = 2x RAM, then the 2GB swap size
> limitation must go.

No, it is not strictly required.

But still the 2GB limit is annoying and together with the
arch-independent maximum number of swap partitions/files it is pretty
dumb. 

So I would propose to first make a small patch to make MAX_SWAPFILES
arch-dependent and bigger. (x86 would allow a much higher
MAX_SWAPFILES)

For 2.5 we could perhaps think about a new swapfile layout which
allows bigger partitions.

Greetings
Christoph





Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Rik van Riel

On 5 Mar 2001, Christoph Rohland wrote:

> BTW often these big servers run databases and application servers
> which have most of their memory in shared memory. Shared memory does
> free the swap entries on swapin. (I thought about changing that but as
> long as we have no garbage collection for idle swap entries I will not
> do it)

It's on my infinite TODO list ;)

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: 2.4 and 2GB swap partition limit

2001-03-05 Thread Andries . Brouwer

> > For 2.5 we could perhaps think about a new swapfile layout

> The format seems to be just fine.

No, the present definition is terrible.

Read the mkswap source. A forest of #ifdefs,
and still sometimes user assistance is required
because mkswap cannot always figure out what the "pagesize" is.

There are two main problems:
(i) "new" swap is hardly larger than "old" swap
(ii) the unit in which new swap is measured is a mystery

So, the next swap space has (i) a signature "SWAPSPACE3",
(ii) (not strictly necessary) a size given as a 64-bit number in bytes.
Moreover, the swapon call must not refuse swapspaces
that are larger than the kernel can handle.
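
A purely hypothetical sketch of what such a header and the "do not
refuse, just use what you can" rule might look like.  The struct
layout, field names and function names are invented for illustration;
only the "SWAPSPACE3" signature and the 64-bit byte count come from
the proposal above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SWAP3_MAGIC "SWAPSPACE3"    /* signature proposed above */

/* Hypothetical header: not an existing on-disk format. */
struct swap3_header {
    char     magic[10];     /* "SWAPSPACE3", no terminating NUL */
    uint64_t size_bytes;    /* size of the swap space in bytes */
};

/* Fill in a header for a swap space of the given size. */
void swap3_init(struct swap3_header *h, uint64_t size_bytes)
{
    memcpy(h->magic, SWAP3_MAGIC, sizeof h->magic);
    h->size_bytes = size_bytes;
}

/* Nonzero if the header carries the expected signature. */
int swap3_valid(const struct swap3_header *h)
{
    return memcmp(h->magic, SWAP3_MAGIC, sizeof h->magic) == 0;
}

/* Pages the kernel would actually use: an oversized swap space is
 * clamped to the kernel's limit rather than rejected. */
uint64_t swap3_usable_pages(const struct swap3_header *h,
                            unsigned page_shift,
                            uint64_t kernel_max_pages)
{
    assert(page_shift < 64);
    uint64_t pages = h->size_bytes >> page_shift;
    return pages > kernel_max_pages ? kernel_max_pages : pages;
}
```

Because the size is stored in bytes, the page size only matters to
the kernel that reads the header, so mkswap would no longer need to
guess it.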

Andries



RE: 2.4 and 2GB swap partition limit

2001-03-04 Thread Matt_Domsch

> > > Linus has spoken, and 2.4.x now requires swap = 2x RAM.
> > 
> > I think I missed this.  What possible value does this have? 

A good write-up of the discussion can be found at:
http://kt.zork.net/kernel-traffic/kt20010126_104.html#2


My concern is that if there continues to be a 2GB swap partition/file size
limitation, and you can have (as currently #defined) 8 swap partitions,
you're limited to 16GB swap, which then follows a max of 8GB RAM.  We'd like
to sell servers with 32GB or 64GB RAM to customers who request such for
their applications.  Such customers generally have no problem purchasing
additional disks to be used for swap, likely on a hardware RAID controller.

We've also seen (anecdotal evidence here) cases where a kernel panics, which
we believe may have to do with having 0 < swap < 2x RAM.  We're
investigating further.

> Actually the deal is: either use enough swap (about 2x RAM) or use
> none at all. 

If swap space isn't required in all cases, great!  We'll encourage the use
of swap files as needed, rather than swap partitions.  But, if instead you
*require* swap = 2x RAM, then the 2GB swap size limitation must go.
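
The limits in this argument reduce to a few lines of arithmetic,
sketched below.  The function names and the GIB macro are made up for
illustration; the inputs (8 swap areas, 2GB each, swap = 2x RAM) are
the ones quoted in this message.

```c
#include <assert.h>
#include <stdint.h>

#define GIB ((uint64_t)1 << 30)

/* Total swap available with a fixed number of size-limited areas. */
uint64_t max_total_swap(unsigned max_swapfiles, uint64_t per_area_limit)
{
    return (uint64_t)max_swapfiles * per_area_limit;
}

/* Largest RAM size that still satisfies "swap = ratio x RAM". */
uint64_t max_ram_under_rule(uint64_t total_swap, unsigned ratio)
{
    assert(ratio > 0);
    return total_swap / ratio;
}

/* Swap areas needed to back a machine under the same rule,
 * rounded up to a whole number of areas. */
unsigned areas_needed(uint64_t ram_bytes, unsigned ratio,
                      uint64_t per_area_limit)
{
    assert(per_area_limit > 0);
    uint64_t swap = ram_bytes * ratio;
    return (unsigned)((swap + per_area_limit - 1) / per_area_limit);
}
```

With 8 areas of 2GB, max_total_swap() gives 16GB and
max_ram_under_rule() with a ratio of 2 gives 8GB.  A 64GB machine
under the same rule would need areas_needed(64 * GIB, 2, 2 * GIB) =
64 areas of 2GB each.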

Thanks,
Matt


-- 
Matt Domsch
Dell Linux Systems Group
Linux OS Development
www.dell.com/linux


