Re: Getting rid of SHMMAX/SHMALL ?

2005-08-07 Thread Alan Cox
On Iau, 2005-08-04 at 15:48 +0100, Hugh Dickins wrote:
> On Thu, 4 Aug 2005, Matti Aarnio wrote:
> > 
> > SHM resources are non-swappable, thus I would not by default
> > let user programs go and allocate very much SHM spaces at all.
> 
> No, SHM resources are swappable.

Large limits such as Oracle needs still allow any user to clog up the box
completely.


RE: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Chen, Kenneth W
Andi Kleen wrote on Thursday, August 04, 2005 3:54 PM
> > This might be too low on a large system.  We usually stress shm pretty hard
> > for db applications and usually use more than 87% of total memory in just
> > one shm segment.  So I prefer either no limit or a tunable.
> 
> With a large system you mean >32GB, right?

Yes, between 32 GB and 128 GB.  On larger NUMA boxes, 256 GB and upward,
we have to break the shm segment into one per NUMA node and then the limit
should be OK.  I was concerned with SMP boxes with large memory.
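
A minimal sketch of that one-segment-per-node layout, assuming libnuma is
available and using a made-up 1 GB per-node segment size (each shmget() is of
course still subject to the SHMMAX limit under discussion):

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <numa.h>                      /* libnuma */

#define SEG_BYTES (1ULL << 30)         /* hypothetical 1 GB per node */

int main(void)
{
    int node, nodes;

    if (numa_available() < 0)
        return 1;
    nodes = numa_max_node() + 1;

    for (node = 0; node < nodes; node++) {
        int id = shmget(IPC_PRIVATE, SEG_BYTES, IPC_CREAT | 0600);
        void *p;

        if (id < 0) { perror("shmget"); return 1; }
        p = shmat(id, NULL, 0);
        if (p == (void *)-1) { perror("shmat"); return 1; }

        /* place this segment's pages on one node */
        numa_tonode_memory(p, SEG_BYTES, node);
        printf("segment %d attached at %p, bound to node %d\n", id, p, node);
    }
    return 0;
}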

> I think on large systems some tuning is reasonable because they likely
> have trained admins. I'm more worried about reasonable defaults for the
> class of systems with 0-4GB.

Sounds reasonable to me.

- Ken



RE: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Chen, Kenneth W
Andi Kleen wrote on Thursday, August 04, 2005 6:24 AM
> I think we should just get rid of the per process limit and keep
> the global limit, but make it auto tuning based on available memory.
> That is still not very nice because that would likely keep it < available 
> memory/2, but I suspect databases usually want more than that. So
> I would even make it bigger than tmpfs for reasonably big machines.
> Let's say
> 
> if (main memory >= 1GB)
>   maxmem = main memory - main memory/8 

This might be too low on a large system.  We usually stress shm pretty hard
for db applications and usually use more than 87% of total memory in just
one shm segment.  So I prefer either no limit or a tunable.

- Ken



Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Andi Kleen
On Thu, Aug 04, 2005 at 03:49:37PM -0700, Chen, Kenneth W wrote:
> Andi Kleen wrote on Thursday, August 04, 2005 6:24 AM
> > I think we should just get rid of the per process limit and keep
> > the global limit, but make it auto tuning based on available memory.
> > That is still not very nice because that would likely keep it < available 
> > memory/2, but I suspect databases usually want more than that. So
> > I would even make it bigger than tmpfs for reasonably big machines.
> > Let's say
> > 
> > if (main memory >= 1GB)
> > maxmem = main memory - main memory/8 
> 
> This might be too low on a large system.  We usually stress shm pretty hard
> for db applications and usually use more than 87% of total memory in just
> one shm segment.  So I prefer either no limit or a tunable.

With a large system you mean >32GB, right?

I think on large systems some tuning is reasonable because they likely
have trained admins. I'm more worried about reasonable defaults for the
class of systems with 0-4GB.

The /8 was to account for the overhead of page tables and mem_map and
leave some other memory for the system, but you're right it might be less 
with hugetlbfs.
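
A back-of-envelope version of that overhead argument, using assumed sizes
(4 KB pages, 64-byte struct page, 8-byte PTEs) on an example 64 GB box:

#include <stdio.h>

int main(void)
{
    unsigned long long ram  = 64ULL << 30;             /* 64 GB example machine     */
    unsigned long long page = 4096;                     /* base page size (assumed)  */
    unsigned long long pages = ram / page;              /* 16M pages                 */

    unsigned long long mem_map = pages * 64;            /* one struct page per page  */
    unsigned long long ptes    = pages * 8;             /* PTEs for one full mapping */

    printf("mem_map     : %llu MB\n", mem_map >> 20);   /* ~1024 MB                  */
    printf("page tables : %llu MB\n", ptes >> 20);      /* ~128 MB per process       */
    /* With 2 MB hugetlbfs pages the page-table cost shrinks ~512x, which is
     * why reserving a full 1/8 of memory may be more than is really needed. */
    return 0;
}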

-Andi


Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Andi Kleen
On Thu, Aug 04, 2005 at 05:20:40PM +0300, Matti Aarnio wrote:
> SHM resources are non-swappable, thus I would not by default

Not true.

> let user programs go and allocate very much SHM spaces at all.
> Such is usually spelled as: "denial-of-service-attack"
> For that reason I would not raise builtin defaults either.

It is equivalent to allocating anonymous memory in programs.

In theory you could limit it for each user by RLIMIT_NPROC*RLIMIT_AS,
but in practice that would be usually
If Linux ever gets a "max memory total used per user" rlimit it may make
sense to limit the shm growth caused by them to that, but that is not
there yet. In addition I want to point out that there are a zillion
subsystems which can be used to allocate quite a lot of memory
(e.g. fill the socket buffers of a few hundred sockets).
So far nobody knows how to limit all of these and it's probably too hard
to do. The general wisdom is that if you want strong isolation like
that, use a virtualized environment.
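
A rough illustration of why the RLIMIT_NPROC*RLIMIT_AS product is not a
useful bound; the limit values below are made up:

#include <stdio.h>

int main(void)
{
    unsigned long long nproc = 4096;            /* assumed RLIMIT_NPROC    */
    unsigned long long as    = 4ULL << 30;      /* assumed RLIMIT_AS: 4 GB */

    /* 4096 processes * 4 GB each = 16 TB of theoretically allowed memory,
     * far above the physical memory of any real machine.                 */
    printf("implied per-user cap: %llu TB\n", (nproc * as) >> 40);
    return 0;
}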

> > 
> > I think we should just get rid of the per process limit and keep
> > the global limit, but make it auto tuning based on available memory.
> 
> Err...  No thanks!   I would prefer to have even finer grained control
> of how much SHM somebody can allocate.  For normal user the value
> might be zero, but for users in a group "SHM1" there could be a level
> of N MB, etc.  (Except that such mechanisms are rather complex...)

shmmni will stay, although the defaults will be larger. If you really
want you can lower it, but in practice it won't buy you much if anything.

-Andi


Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Matti Aarnio
On Thu, Aug 04, 2005 at 03:23:38PM +0200, Andi Kleen wrote:
> On Thu, Aug 04, 2005 at 02:19:21PM +0100, Hugh Dickins wrote:
> > On Thu, 4 Aug 2005, Andi Kleen wrote:
> > 
> > > I noticed that even 64bit architectures have a ridiculously low 
> > > max limit on shared memory segments by default:
> > > 
> > > #define SHMMAX 0x2000000 /* max shared seg size (bytes) */
> > > #define SHMMNI 4096  /* max num of segs system wide */
> > > #define SHMALL (SHMMAX/PAGE_SIZE*(SHMMNI/16)) /* max shm system wide (pages) */
> > > 
> > > Even on 32bit architectures it is far too small and doesn't
> > > make much sense. Does anybody remember why we even have this limit?
> > 
> > To be like the UNIXes.
> 
> Ok, no other more fundamental reason  ? :) 
> I cannot think of any at least.

Those supply DEFAULT values at boot time, and they can be
adjusted with sysctl.   The existence of the limits is good.
Their easy tunability (even easier than on Solaris, where
you tune them only with a reboot) is even better.
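
For example, the limits can be raised on a running system by writing to the
/proc/sys/kernel entries (values here are made up; the same is normally done
with sysctl(8) or /etc/sysctl.conf):

#include <stdio.h>

/* Write a value to a /proc/sys file; needs root. */
static int set_sysctl(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");

    if (!f) { perror(path); return -1; }
    fprintf(f, "%s\n", value);
    return fclose(f);
}

int main(void)
{
    set_sysctl("/proc/sys/kernel/shmmax", "8589934592");  /* 8 GB per segment (example)  */
    set_sysctl("/proc/sys/kernel/shmall", "4194304");     /* pages system wide (example) */
    return 0;
}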

SHM resources are non-swappable, thus I would not by default
let user programs go and allocate very much SHM spaces at all.
Such is usually spelled as: "denial-of-service-attack"
For that reason I would not raise builtin defaults either.

...

> 
> I think we should just get rid of the per process limit and keep
> the global limit, but make it auto tuning based on available memory.

Err...  No thanks!   I would prefer to have even finer grained control
of how much SHM somebody can allocate.  For normal user the value
might be zero, but for users in a group "SHM1" there could be a level
of N MB, etc.  (Except that such mechanisms are rather complex...)

For dedicated servers there is no problem in letting there be a single
global limit with its default value in highish realms, but pick
any machine with multiple users running their own programs.
Consider all of them hostile (the clueless can do as much damage
as the intentionally hostile).

Mmm...  Apparently X (and/or other parts of the desktop) do ask for
a number of shared memory segments, so the default user allocation limit
can't be zero.


> That is still not very nice because that would likely keep it < available 
> memory/2, but I suspect databases usually want more than that. So
> I would even make it bigger than tmpfs for reasonably big machines.
> Let's say
> 
> if (main memory >= 1GB)
>   maxmem = main memory - main memory/8 
> else  
>   maxmem = main memory / 2
> 
> possibly increase the 4096 segments limit too, it seems quite low,
> or also auto tune based on memory.
> 
> One possible problem with getting rid of /proc/sys/kernel/shmmni 
> would be that some programs might read it and fail if it's not available.
> So I would probably keep it read only but always return LONG_MAX.
>  
> > I don't think my opinion is worth much on this:
> > what would the distro tuners like to see there?
> 
> suse has shipped larger default limits for a long time.
> And all the databases and some other software documents
> increasing these values.

If there were kernels optimized for database servers, then
the hard-wired defaults might be raised, of course.  On the other hand,
the sysadmin knows best, and we have adjustment tools that don't
require a kernel recompile, nor even a reboot, to be effective.


> -Andi

  /Matti Aarnio


Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Hugh Dickins
On Thu, 4 Aug 2005, Matti Aarnio wrote:
> 
> SHM resources are non-swappable, thus I would not by default
> let user programs go and allocate very much SHM spaces at all.

No, SHM resources are swappable.

Hugh


Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Jakob Oestergaard
On Thu, Aug 04, 2005 at 02:19:21PM +0100, Hugh Dickins wrote:
...
> > Even on 32bit architectures it is far too small and doesn't
> > make much sense. Does anybody remember why we even have this limit?
> 
> To be like the UNIXes.

 :)

...
> Anton proposed raising the limits last autumn, but I was a bit
> discouraging back then, having noticed that even Solaris 9 was more
> restrictive than Linux.  They seem to be ancient traditional limits
> which everyone knows must be raised to get real work done.

As I understand it (and I may be mistaken - if so please let me know) -
the limit is for SVR4 IPC shared memory (shmget() and friends), and not
shared memory in general.

It makes good sense to limit use of the old SVR4 shared memory
resources, as they're generally administrator hell (they don't free up
resources on process exit), and just plain shouldn't be used.

It is my impression that SVR4 shmem is used in very few applications,
and that the low limit is more than sufficient in most cases.

Any proper application that really needs shared memory can either
memory map /dev/null and share that map (swap-backed shared memory) or
memory map a file on disk.
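
A minimal sketch of the mmap-a-file option (path and size are made up), which
bypasses the SysV limits entirely:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    size_t len = 1 << 20;                                      /* 1 MB shared region (example) */
    int fd = open("/tmp/shared.dat", O_RDWR | O_CREAT, 0600);  /* hypothetical file            */
    char *p;

    if (fd < 0 || ftruncate(fd, len) != 0) { perror("setup"); return 1; }

    /* MAP_SHARED: every process mapping the file sees the same pages */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(p, "written by the parent");
    if (fork() == 0) {                                         /* child sees the parent's data */
        printf("child reads: %s\n", p);
        _exit(0);
    }
    wait(NULL);
    return 0;
}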

If the above makes sense and isn't too far from the truth, then I guess
that's a pretty good argument for maintaining the status quo.

-- 

 / jakob



Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Andi Kleen
On Thu, Aug 04, 2005 at 02:19:21PM +0100, Hugh Dickins wrote:
> On Thu, 4 Aug 2005, Andi Kleen wrote:
> 
> > I noticed that even 64bit architectures have a ridiculously low 
> > max limit on shared memory segments by default:
> > 
> > #define SHMMAX 0x2000000 /* max shared seg size (bytes) */
> > #define SHMMNI 4096  /* max num of segs system wide */
> > #define SHMALL (SHMMAX/PAGE_SIZE*(SHMMNI/16)) /* max shm system wide (pages) */
> > 
> > Even on 32bit architectures it is far too small and doesn't
> > make much sense. Does anybody remember why we even have this limit?
> 
> To be like the UNIXes.

Ok, no other more fundamental reason  ? :) 
I cannot think of any at least.

> 
> > IMHO per process shm mappings should just be controlled by the normal
> > process and global mappings with the same heuristics as tmpfs
> > (by default max memory / 2 or more if shmfs is mounted with more)
> > Actually I suspect databases will usually want to use more 
> > so it might even make sense to support max memory - 1/8*max_memory
> > 
> > I would propose to get rid of shmmax completely
> > and only keep the old shmall sysctl for compatibility.
> 
> Anton proposed raising the limits last autumn, but I was a bit
> discouraging back then, having noticed that even Solaris 9 was more
> restrictive than Linux.  They seem to be ancient traditional limits
> which everyone knows must be raised to get real work done.
> 
> It's possible that if we raise the limits, installation
> of this or that application will then lower them again?

I think we should just get rid of the per process limit and keep
the global limit, but make it auto tuning based on available memory.
That is still not very nice because that would likely keep it < available 
memory/2, but I suspect databases usually want more than that. So
I would even make it bigger than tmpfs for reasonably big machines.
Let's say

if (main memory >= 1GB)
maxmem = main memory - main memory/8 
else  
maxmem = main memory / 2
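
The same heuristic written out as a plain C helper (illustrative only, not
actual kernel code):

#include <stdio.h>

/* Proposed auto-tuned default: reserve 1/8 of RAM above 1 GB, half below. */
static unsigned long long default_shm_bytes(unsigned long long main_memory)
{
    if (main_memory >= 1ULL << 30)
        return main_memory - main_memory / 8;
    return main_memory / 2;
}

int main(void)
{
    printf("512 MB box -> %llu MB\n", default_shm_bytes(512ULL << 20) >> 20); /* 256 MB */
    printf(" 64 GB box -> %llu GB\n", default_shm_bytes(64ULL << 30) >> 30);  /* 56 GB  */
    return 0;
}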

possibly increase the 4096 segments limit too, it seems quite low,
or also auto tune based on memory.

One possible problem with getting rid of /proc/sys/kernel/shmmni 
would be that some programs might read it and fail if it's not available.
So I would probably keep it read only but always return LONG_MAX.

> 
> I don't think my opinion is worth much on this:
> what would the distro tuners like to see there?

suse has shipped larger default limits for a long time.
And all the databases and some other software documents increasing these
values.

-Andi


Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Hugh Dickins
On Thu, 4 Aug 2005, Andi Kleen wrote:

> I noticed that even 64bit architectures have a ridiculously low 
> max limit on shared memory segments by default:
> 
> #define SHMMAX 0x2000000 /* max shared seg size (bytes) */
> #define SHMMNI 4096  /* max num of segs system wide */
> #define SHMALL (SHMMAX/PAGE_SIZE*(SHMMNI/16)) /* max shm system wide (pages) */
> 
> Even on 32bit architectures it is far too small and doesn't
> make much sense. Does anybody remember why we even have this limit?

To be like the UNIXes.

> IMHO per process shm mappings should just be controlled by the normal
> process and global mappings with the same heuristics as tmpfs
> (by default max memory / 2 or more if shmfs is mounted with more)
> Actually I suspect databases will usually want to use more 
> so it might even make sense to support max memory - 1/8*max_memory
> 
> I would propose to get rid of shmmax completely
> and only keep the old shmall sysctl for compatibility.

Anton proposed raising the limits last autumn, but I was a bit
discouraging back then, having noticed that even Solaris 9 was more
restrictive than Linux.  They seem to be ancient traditional limits
which everyone knows must be raised to get real work done.

It's possible that if we raise the limits, installation
of this or that application will then lower them again?

I don't think my opinion is worth much on this:
what would the distro tuners like to see there?

Hugh


Re: Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread linux-os (Dick Johnson)

On Thu, 4 Aug 2005, Andi Kleen wrote:

>
> I noticed that even 64bit architectures have a ridiculously low
> max limit on shared memory segments by default:
>
> #define SHMMAX 0x2000000 /* max shared seg size (bytes) */
> #define SHMMNI 4096  /* max num of segs system wide */
> #define SHMALL (SHMMAX/PAGE_SIZE*(SHMMNI/16)) /* max shm system wide (pages) */
>
> Even on 32bit architectures it is far too small and doesn't
> make much sense. Does anybody remember why we even have this limit?
>
> IMHO per process shm mappings should just be controlled by the normal
> process and global mappings with the same heuristics as tmpfs
> (by default max memory / 2 or more if shmfs is mounted with more)
> Actually I suspect databases will usually want to use more
> so it might even make sense to support max memory - 1/8*max_memory
>
> I would propose to get rid of shmmax completely
> and only keep the old shmall sysctl for compatibility.
>
> Comments?
>
> -Andi


It doesn't seem to be used very much. Here's the `grep` of the
entire 2.6.12 source-tree:


size_t  shm_ctlmax = SHMMAX;
./ipc/shm.c
 (actually only bits 25..16 get used since SHMMAX is so low)
./include/asm-m68k/shm.h
KERN_SHMMAX=34, /* long: Maximum shared memory segment */
./include/linux/sysctl.h
  * SHMMAX, SHMMNI and SHMALL are upper limits are defaults which can
#define SHMMAX 0x2000000 /* max shared seg size (bytes) */
#define SHMALL (SHMMAX/PAGE_SIZE*(SHMMNI/16)) /* max shm system wide (pages) */
./include/linux/shm.h
 (actually only bits 25..16 get used since SHMMAX is so low)
./include/asm-h8300/shm.h
#ifndef SHMMAX
#define SHMMAX  0x003fa000
./include/asm-arm26/shmparam.h
.ctl_name   = KERN_SHMMAX,
./kernel/sysctl.c



Cheers,
Dick Johnson
Penguin : Linux version 2.6.12 on an i686 machine (5537.79 BogoMips).
Warning : 98.36% of all statistics are fiction.


Getting rid of SHMMAX/SHMALL ?

2005-08-04 Thread Andi Kleen

I noticed that even 64bit architectures have a ridiculously low 
max limit on shared memory segments by default:

#define SHMMAX 0x2000000 /* max shared seg size (bytes) */
#define SHMMNI 4096  /* max num of segs system wide */
#define SHMALL (SHMMAX/PAGE_SIZE*(SHMMNI/16)) /* max shm system wide (pages) */

Even on 32bit architectures it is far too small and doesn't
make much sense. Does anybody remember why we even have this limit?
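
For scale, assuming the usual 4 KB page size, those defaults work out roughly
as follows:

#include <stdio.h>

#define PAGE_SIZE 4096UL                              /* assumed 4 KB pages */
#define SHMMAX 0x2000000UL                            /* 32 MB per segment  */
#define SHMMNI 4096UL
#define SHMALL (SHMMAX / PAGE_SIZE * (SHMMNI / 16))   /* pages, system wide */

int main(void)
{
    printf("SHMMAX: %lu MB per segment\n", SHMMAX >> 20);                 /* 32 MB */
    printf("SHMALL: %lu pages = %llu GB system wide\n",
           SHMALL, (unsigned long long)SHMALL * PAGE_SIZE >> 30);         /* 8 GB  */
    return 0;
}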

IMHO per process shm mappings should just be controlled by the normal
process and global mappings with the same heuristics as tmpfs
(by default max memory / 2 or more if shmfs is mounted with more)
Actually I suspect databases will usually want to use more 
so it might even make sense to support max memory - 1/8*max_memory

I would propose to get rid of shmmax completely
and only keep the old shmall sysctl for compatibility.

Comments?

-Andi

