Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-15 Thread Vincent Lefevre
On 2004-05-16 00:09:28 +0900, GOTO Masanori wrote:
> 2.4 is starting to hand over its role as the stable version to 2.6.
> So if you want to fix the 2.4 kernel documentation, please do.

I posted a message to the linux-kernel mailing-list, but people there
are quite ignorant about the C language.

> I think we have reached agreement that this bug is not a glibc-related
> problem.  Thanks to Wolfram and Daniel.  I would like to close this
> bug.  Vincent, OK?

OK.

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-15 Thread GOTO Masanori
At Sun, 9 May 2004 19:39:59 +0200,
Vincent Lefevre wrote:
> > Second, I for one would consider the absence of overcommitment as a
> > bad misfeature. Good luck with the kernel bug report.
> 
> The 2.6 kernel seems to have better memory handling (but I couldn't
> try yet). Concerning the 2.4 kernel, the problem seems to be the
> documentation.

2.4 is starting to hand over its role as the stable version to 2.6.
So if you want to fix the 2.4 kernel documentation, please do.

I think we have reached agreement that this bug is not a glibc-related
problem.  Thanks to Wolfram and Daniel.  I would like to close this
bug.  Vincent, OK?

Regards,
-- gotom





Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-09 Thread wg
> I am not sure how one could describe a system crippled in this way as 
> "functional".  The limit itself would indeed be "functional",

Ok.  You could have stopped there.  This is going to be my last post
in this thread.

> [begin historical analogy]
> Using ulimit in this way would give one an environment a bit like that 
> of IBM's antique VM operating system for their 360 series of 
> computers, except even worse.  In VM, every user had his own small 
> "virtual disk", and the entire real disk was partitioned among them. 
> So, 200 users on a system with 50 MB of disk space meant each user got 
> exactly 250 KB of disk space, and some users would run out of disk 
> space even if the system disk (seen as an aggregate) was only a few 
> percent full.
> [end historical analogy]

A good analogy!  So, by your reasoning below, all the antiquated Un*x
administrators in the world using quotas on their filesystems should
immediately stop doing so, because it isn't practical!?

> Remember, /etc/initscript doesn't let me set limits on a per-user 
> basis.  Also, remember that ulimit for the number of processes applies 
> only on a per-user basis, not as a pool across the entire number of 
> processes.

Of course ulimits can and must be refined for queue schedulers, for
example.  That is beside the point.

> So let's see.  The system I'm currently running this on (a laptop) has 
> 1 GB of memory, 23 users in /etc/passwd.  The user with the most 
> processes (ignoring kernel processes like [kapmd] that don't take up 
> memory) currently has 61 processes.

Quite a lot.  No swap space?

> If we take these as hard limits, 
> then 1 GB / (61 * 23) = 747 KB of memory per process.  That isn't 
> enough even to run bash, xterm, or dhclient, to say nothing of the X 
> server or Mozilla or Emacs.

So you want 23 users on your laptop to be able to run 61 processes
each?  And under no circumstances do you want to see "out of memory:
killing process..."?  Then, I'm sorry to say: you're out of luck.

> > For me the alternative is
> > clear: either enjoy the advantages and disadvantages of overcommitment
> > _or_ use "ulimit -v".
> 
> The second of those appears impractical, for the reasons I argue above.

It depends..  But I like overcommitment, too :-)

Regards,
Wolfram.




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-09 Thread Vincent Lefevre
On 2004-05-09 17:03:13 +0200, Wolfram Gloger wrote:
> I know, but see /etc/initscript.  Together with the number of
> processes (also ulimited), it effectively is a limit for the system.
> Not the one you like, obviously, but nevertheless a functional one.

Definitely not realistic. I don't want to set any limit. The machine
is used in particular to do some computations that need a lot of
memory.

> > Wrong remark. Solaris behaves correctly, for instance (i.e. if there
> > is no memory left, malloc() returns 0, without needing to set limits
> > on processes).
> 
> "Correctly"?  First, I doubt that Solaris has no overcommitment -- try
> a test with fork() (it should then fail unless there is enough
> physical memory left for a second copy of _all_ writeable pages of the
> current process).

fork() fails when there isn't enough memory, but this is a feature,
and this can be controlled by the programmer. I used Solaris for
several years and this was quite rare. Also, the processes that
take much memory would probably quit early enough in a nice way.

> Second, I for one would consider the absence of overcommitment as a
> bad misfeature. Good luck with the kernel bug report.

The 2.6 kernel seems to have better memory handling (but I couldn't
try yet). Concerning the 2.4 kernel, the problem seems to be the
documentation.

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-09 Thread Steven Augart
Wolfram Gloger wrote:
> > [...] In my documentation, ulimit sets
> > a limit for the *current* process, not for the whole system.
>
> I know, but see /etc/initscript.  Together with the number of
> processes (also ulimited), it effectively is a limit for the system.
> Not the one you like, obviously, but nevertheless a functional one.

I am not sure how one could describe a system crippled in this way as 
"functional".  The limit itself would indeed be "functional", but, to 
my best understanding, only in the same way that throwing the computer 
off of the roof of a building would also function to limit the memory 
consumption of processes running on it.  Allow me to explain:

[begin historical analogy]
Using ulimit in this way would give one an environment a bit like that 
of IBM's antique VM operating system for their 360 series of 
computers, except even worse.  In VM, every user had his own small 
"virtual disk", and the entire real disk was partitioned among them. 
So, 200 users on a system with 50 MB of disk space meant each user got 
exactly 250 KB of disk space, and some users would run out of disk 
space even if the system disk (seen as an aggregate) was only a few 
percent full.
[end historical analogy]

Remember, /etc/initscript doesn't let me set limits on a per-user 
basis.  Also, remember that ulimit for the number of processes applies 
only on a per-user basis, not as a pool across the entire number of 
processes.

So let's see.  The system I'm currently running this on (a laptop) has 
1 GB of memory, 23 users in /etc/passwd.  The user with the most 
processes (ignoring kernel processes like [kapmd] that don't take up 
memory) currently has 61 processes.  If we take these as hard limits, 
then 1 GB / (61 * 23) = 747 KB of memory per process.  That isn't 
enough even to run bash, xterm, or dhclient, to say nothing of the X 
server or Mozilla or Emacs.

[...]
> For me the alternative is
> clear: either enjoy the advantages and disadvantages of overcommitment
> _or_ use "ulimit -v".

The second of those appears impractical, for the reasons I argue above.

> Regards,
> Wolfram.

Cordially yours,
--Steve
--
Steven Augart
Jikes RVM, a free, open source, Virtual Machine:
http://oss.software.ibm.com/jikesrvm



Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-09 Thread Wolfram Gloger
> > Oh, so you're adding/removing physical memory dynamically?
> 
> No. I don't see why you ask this. In my documentation, ulimit sets
> a limit for the *current* process, not for the whole system.

I know, but see /etc/initscript.  Together with the number of
processes (also ulimited), it effectively is a limit for the system.
Not the one you like, obviously, but nevertheless a functional one.

> > Well, either live with that or add sufficient swap space.
> 
> Wrong remark. Solaris behaves correctly, for instance (i.e. if there
> is no memory left, malloc() returns 0, without needing to set limits
> on processes).

"Correctly"?  First, I doubt that Solaris has no overcommitment -- try
a test with fork() (it should then fail unless there is enough
physical memory left for a second copy of _all_ writeable pages of the
current process).  Second, I for one would consider the absence of
overcommitment as a bad misfeature.  Good luck with the kernel bug
report.

This has been beaten to death many times.  For me the alternative is
clear: either enjoy the advantages and disadvantages of overcommitment
_or_ use "ulimit -v".

Regards,
Wolfram.





Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-09 Thread wg
> > In general, if you want malloc to return NULL on Linux in a controlled
> > way, the best advice is to use "ulimit -v" IMHO.
> 
> No, this is really a very bad idea, as this would limit the virtual
> memory, instead of checks being done dynamically.

Oh, so you're adding/removing physical memory dynamically?

> And the memory will
> quickly be exhausted.

Well, either live with that or add sufficient swap space.

Regards,
Wolfram.




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-09 Thread Vincent Lefevre
On 2004-05-09 14:16:13 -, [EMAIL PROTECTED] wrote:
> > > In general, if you want malloc to return NULL on Linux in a controlled
> > > way, the best advice is to use "ulimit -v" IMHO.
> > 
> > No, this is really a very bad idea, as this would limit the virtual
> > memory, instead of checks being done dynamically.
> 
> Oh, so you're adding/removing physical memory dynamically?

No. I don't see why you ask this. In my documentation, ulimit sets
a limit for the *current* process, not for the whole system. If you
have a solution with ulimit -v (one that sets no tighter limit than
the free memory), then I would be interested...

> > And the memory will quickly be exhausted.
> 
> Well, either live with that or add sufficient swap space.

Wrong remark. Solaris behaves correctly, for instance (i.e. if there
is no memory left, malloc() returns 0, without needing to set limits
on processes).

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-08 Thread Vincent Lefevre
On 2004-05-08 11:10:01 -0400, Daniel Jacobowitz wrote:
> Does strace even show MAP_NORESERVE?  I don't think the mmap call that
> you're looking at is even the one which uses MAP_NORESERVE, since in
> the copy of glibc source I'm looking at, that's only used for
> allocating secondary arenas.

You're right, it's another mmap (one in malloc.c), with
PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS. When I wrote the bug
report, I was assuming that the kernel documentation was correct (and
from that, I thought that the problem could only come from the mmap in
arena.c). Now it appears more clearly that this is a kernel problem...

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-08 Thread Daniel Jacobowitz
On Sat, May 08, 2004 at 05:02:14PM +0200, Vincent Lefevre wrote:
> On 2004-05-08 15:13:40 +0200, Wolfram Gloger wrote:
> > But now concerning the bug report in question: I see no bug.  The
> > MAP_NORESERVE does not matter here at all.  Note that before malloc
> > hands out memory in a region allocated with MAP_NORESERVE, it _must_
> > call mprotect(..., PROT_READ|PROT_WRITE) on a smaller subregion, and
> > _that_ call definitely should be checked by the kernel against
> > overcommitment accounting, as _then_ (and only then) physical memory
> > really is potentially allocated.  I believe this to be the case in
> > Linux.
> 
> But mprotect seems to be never called (strace just shows old_mmap
> calls).

Does strace show mapping with PROT_NONE?  If so and there's no
mprotect, then I'm quite confused - access should fail.

Does strace even show MAP_NORESERVE?  I don't think the mmap call that
you're looking at is even the one which uses MAP_NORESERVE, since in
the copy of glibc source I'm looking at, that's only used for
allocating secondary arenas.

-- 
Daniel Jacobowitz




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-08 Thread Vincent Lefevre
On 2004-05-08 15:13:40 +0200, Wolfram Gloger wrote:
> But now concerning the bug report in question: I see no bug.  The
> MAP_NORESERVE does not matter here at all.  Note that before malloc
> hands out memory in a region allocated with MAP_NORESERVE, it _must_
> call mprotect(..., PROT_READ|PROT_WRITE) on a smaller subregion, and
> _that_ call definitely should be checked by the kernel against
> overcommitment accounting, as _then_ (and only then) physical memory
> really is potentially allocated.  I believe this to be the case in
> Linux.

But mprotect seems to be never called (strace just shows old_mmap
calls).

> In general, if you want malloc to return NULL on Linux in a controlled
> way, the best advice is to use "ulimit -v" IMHO.

No, this is really a very bad idea, as this would limit the virtual
memory, instead of checks being done dynamically. And the memory will
quickly be exhausted. Well, unless ulimit does something special, I
don't see how it can be used to globally solve the malloc problem in
practice.

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-08 Thread Wolfram Gloger
Hi,

> Since I don't know enough to continue this discussion, I'm copying
> someone who does!

I hope I can clear this up..

> Wolfram, is there documentation describing the choice of MAP_NORESERVE
> in glibc's malloc, or do you know someone else I should ask?  Thanks in
> advance.

The intention here is to avoid overcommitment accounting for the
mapping with MAP_NORESERVE in malloc.  Since the prot argument for
this mapping is set to PROT_NONE, this is just a second line of
defense, however, as a mapping with this protection mode should not
count against any overcommitment accounting anyway.

I thought that Linux not implementing MAP_NORESERVE did not matter
because it looked at the prot argument before accounting, but I found
the MAP_NORESERVE flag useful e.g. on SGI.

So it's definitely intentional, but not _required_ for correct
malloc operation.  If the creation of a new arena fails, the main
arena is tried secondarily (in fact, this is like a third line
of defense).

But now concerning the bug report in question: I see no bug.  The
MAP_NORESERVE does not matter here at all.  Note that before malloc
hands out memory in a region allocated with MAP_NORESERVE, it _must_
call mprotect(..., PROT_READ|PROT_WRITE) on a smaller subregion, and
_that_ call definitely should be checked by the kernel against
overcommitment accounting, as _then_ (and only then) physical memory
really is potentially allocated.  I believe this to be the case in
Linux.

In general, if you want malloc to return NULL on Linux in a controlled
way, the best advice is to use "ulimit -v" IMHO.

Regards,
Wolfram.





Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-07 Thread Daniel Jacobowitz
On Fri, May 07, 2004 at 09:25:11AM +0200, Vincent Lefevre wrote:
> On 2004-05-06 19:05:53 -0400, Daniel Jacobowitz wrote:
> > On Fri, May 07, 2004 at 12:58:28AM +0200, Vincent Lefevre wrote:
> > > On 2004-05-06 16:44:54 -0400, Daniel Jacobowitz wrote:
> > > > The documentation in the 2.4 kernel is, indeed, wrong IIRC.
> > > > 
> > > > BTW, from 2.6:
> > > >   In mode 2 the MAP_NORESERVE flag is ignored. 
> > > 
> > > This is precisely what I thought to be the bug in glibc: the fact
> > > (at least with the 2.4 kernel) that this flag is used by glibc.
> > > On which documentation is glibc based (in particular, concerning
> > > old_mmap)?
> > 
> > I don't know.  The comments suggest malloc made the choice
> > deliberately, though, so I don't think it's appropriate to reverse it.
> 
> But there are no explanations concerning this choice. And without
> explanations, it is difficult to say if this is right or wrong.
> 
> The malloc(3) man page more or less says that this is a kernel bug
> and a kernel with strict overcommit is necessary. But this is a
> contradiction with the 2.4 kernel documentation, which says that
> the kernel checks if there's enough memory left (if the memory is
> really reserved -- so the problem of reserving memory comes from
> the caller, here glibc).

Since I don't know enough to continue this discussion, I'm copying
someone who does!

Wolfram, is there documentation describing the choice of MAP_NORESERVE
in glibc's malloc, or do you know someone else I should ask?  Thanks in
advance.

-- 
Daniel Jacobowitz




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-07 Thread Vincent Lefevre
On 2004-05-06 19:05:53 -0400, Daniel Jacobowitz wrote:
> On Fri, May 07, 2004 at 12:58:28AM +0200, Vincent Lefevre wrote:
> > On 2004-05-06 16:44:54 -0400, Daniel Jacobowitz wrote:
> > > The documentation in the 2.4 kernel is, indeed, wrong IIRC.
> > > 
> > > BTW, from 2.6:
> > >   In mode 2 the MAP_NORESERVE flag is ignored. 
> > 
> > This is precisely what I thought to be the bug in glibc: the fact
> > (at least with the 2.4 kernel) that this flag is used by glibc.
> > On which documentation is glibc based (in particular, concerning
> > old_mmap)?
> 
> I don't know.  The comments suggest malloc made the choice
> deliberately, though, so I don't think it's appropriate to reverse it.

But there are no explanations concerning this choice. And without
explanations, it is difficult to say if this is right or wrong.

The malloc(3) man page more or less says that this is a kernel bug
and a kernel with strict overcommit is necessary. But this is a
contradiction with the 2.4 kernel documentation, which says that
the kernel checks if there's enough memory left (if the memory is
really reserved -- so the problem of reserving memory comes from
the caller, here glibc).

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Steven Augart
Daniel Jacobowitz wrote:
> On Fri, May 07, 2004 at 12:58:28AM +0200, Vincent Lefevre wrote:
> [...]
> > This is precisely what I thought to be the bug in glibc: the fact
> > (at least with the 2.4 kernel) that this flag is used by glibc.
> > On which documentation is glibc based (in particular, concerning
> > old_mmap)?
>
> I don't know.  The comments suggest malloc made the choice
> deliberately, though, so I don't think it's appropriate to reverse it.

For what it's worth, this is in the malloc(3) manual page on my
Debian/sarge system (from the package manpages-dev-1.66-1):

BUGS
   By default, Linux follows an optimistic memory allocation strategy.
   This means that when malloc() returns non-NULL there is no guarantee
   that the memory really is available. This is a really bad bug.  In case
   it turns out that the system is out of memory, one or more processes
   will be killed by the infamous OOM killer.  In case Linux is employed
   under circumstances where it would be less desirable to suddenly lose
   some randomly picked processes, and moreover the kernel version is
   sufficiently recent, one can switch off this overcommitting behavior
   using a command like

       # echo 2 > /proc/sys/vm/overcommit_memory

   See also the kernel Documentation directory, files
   vm/overcommit-accounting and sysctl/vm.txt.

If somebody would like to submit a patch to manpages-dev, I'm sure it 
would save grief for other developers to have it clearly spelled out 
that glibc's malloc() uses the MAP_NORESERVE flag when allocating 
memory.   It would tell people what to look for when reading the 
kernel documentation referred to.

--Steve Augart
--
Steven Augart
Jikes RVM, a free, open source, Virtual Machine:
http://oss.software.ibm.com/jikesrvm



Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Daniel Jacobowitz
On Fri, May 07, 2004 at 12:58:28AM +0200, Vincent Lefevre wrote:
> On 2004-05-06 16:44:54 -0400, Daniel Jacobowitz wrote:
> > The documentation in the 2.4 kernel is, indeed, wrong IIRC.
> > 
> > BTW, from 2.6:
> >   In mode 2 the MAP_NORESERVE flag is ignored. 
> 
> This is precisely what I thought to be the bug in glibc: the fact
> (at least with the 2.4 kernel) that this flag is used by glibc.
> On which documentation is glibc based (in particular, concerning
> old_mmap)?

I don't know.  The comments suggest malloc made the choice
deliberately, though, so I don't think it's appropriate to reverse it.

-- 
Daniel Jacobowitz




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Vincent Lefevre
On 2004-05-06 16:44:54 -0400, Daniel Jacobowitz wrote:
> The documentation in the 2.4 kernel is, indeed, wrong IIRC.
> 
> BTW, from 2.6:
>   In mode 2 the MAP_NORESERVE flag is ignored. 

This is precisely what I thought to be the bug in glibc: the fact
(at least with the 2.4 kernel) that this flag is used by glibc.
On which documentation is glibc based (in particular, concerning
old_mmap)?

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Vincent Lefevre
On 2004-05-06 12:25:19 -0400, Daniel Jacobowitz wrote:
> Overcommit does not work properly in 2.4, though.  GOTO-san is right -
> from your description you want strict overcommit, i.e. the value of 2
> for this flag.  Some of the 2.4-ac kernels had this.  So does 2.6.

This is strange, because 3 different users[*] told me that the 2.4
kernel was OK for me. And according to the 2.4 vm.txt documentation,
setting the overcommit_memory flag to 0 in the 2.4 kernel should be
sufficient (otherwise the documentation is plainly wrong).

[*] Msgid <[EMAIL PROTECTED]>
and <[EMAIL PROTECTED]> for the first two, and in a
private forum for the last one (and this one said that this was working
in the past with an official 2.4 kernel, i.e. non-patched).

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Daniel Jacobowitz
On Thu, May 06, 2004 at 10:34:05PM +0200, Vincent Lefevre wrote:
> On 2004-05-06 12:25:19 -0400, Daniel Jacobowitz wrote:
> > Overcommit does not work properly in 2.4, though.  GOTO-san is right -
> > from your description you want strict overcommit, i.e. the value of 2
> > for this flag.  Some of the 2.4-ac kernels had this.  So does 2.6.
> 
> This is strange, because 3 different users[*] told me that the 2.4
> kernel was OK for me. And according to the 2.4 vm.txt documentation,
> setting the overcommit_memory flag to 0 in the 2.4 kernel should be
> sufficient (otherwise the documentation is plainly wrong).
> 
> [*] Msgid <[EMAIL PROTECTED]>
> and <[EMAIL PROTECTED]> for the first two, and in a
> private forum for the last one (and this one said that this was working
> in the past with an official 2.4 kernel, i.e. non-patched).

This is strange, because many people at MV have told me the opposite.

At least one of the people who suggested it to you (2nd message-ID
above) was using a patched distro kernel from one of the Enterprise
distributions.  They've all done local work in this area during the 2.4
timeframe.

The documentation in the 2.4 kernel is, indeed, wrong IIRC.

BTW, from 2.6:
  In mode 2 the MAP_NORESERVE flag is ignored. 


-- 
Daniel Jacobowitz




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Daniel Jacobowitz
On Thu, May 06, 2004 at 09:05:29AM +0200, Vincent Lefevre wrote:
> In the 2.4 kernel documentation, Documentation/sysctl/vm.txt says:
> 
> overcommit_memory:
> 
> This value contains a flag that enables memory overcommitment.
> When this flag is 0, the kernel checks before each malloc()
> to see if there's enough memory left. If the flag is nonzero,
> the system pretends there's always enough memory.
> 
> Isn't it clear?
> 
> So, according to linux specification, the kernel does the check if
> overcommit_memory is 0 (my case)... unless the caller asks not to
> reserve (but malloc() is a reservation, so I don't see why glibc
> sets the MAP_NORESERVE flag, if I've understood correctly).

Overcommit does not work properly in 2.4, though.  GOTO-san is right -
from your description you want strict overcommit, i.e. the value of 2
for this flag.  Some of the 2.4-ac kernels had this.  So does 2.6.

The description from 2.6 says:

The Linux kernel supports three overcommit handling modes

0   -   Heuristic overcommit handling. Obvious overcommits of
        address space are refused. Used for a typical system. It
        ensures a seriously wild allocation fails while allowing
        overcommit to reduce swap usage.  root is allowed to
        allocate slightly more memory in this mode. This is the
        default.

1   -   No overcommit handling. Appropriate for some scientific
        applications.

2   -   (NEW) strict overcommit. The total address space commit
        for the system is not permitted to exceed swap + a
        configurable percentage (default is 50) of physical RAM.
        Depending on the percentage you use, in most situations
        this means a process will not be killed while accessing
        pages but will receive errors on memory allocation as
        appropriate.

-- 
Daniel Jacobowitz




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Vincent Lefevre
On 2004-05-06 13:45:45 +0900, GOTO Masanori wrote:
> IIRC, this kind of OOM (Out Of Memory) situation has been discussed at
> linux kernel related lists for a long time.  Your sample code repeats
> to issue mmap() and to fill the acquired pages until it got problem.
> So kernel needs to detect the physical memory exhaustion before
> allocating virtual memory.  Please imagine how to do it (to deal with
> various types of memories like mmap)?  Yeah, welcome to VM world.

The 2.4 kernel does have such a function:

int vm_enough_memory(long pages)

in mm/mmap.c; looking at strace output, the glibc calls old_mmap,
which has a call to do_mmap_pgoff, which contains the check here:

    /* Private writable mapping? Check memory availability.. */
    if ((vm_flags & (VM_SHARED | VM_WRITE)) == VM_WRITE &&
        !(flags & MAP_NORESERVE) &&
        !vm_enough_memory(len >> PAGE_SHIFT))
            return -ENOMEM;

But the check is done only when the MAP_NORESERVE flag isn't set.

> AFAIK, in kernel 2.6, there is strictly overcommit mode
> (/proc/sys/vm/overcommit_memory = 2) to prevent from getting sigkill
> by OOM killer and so on.  It helps you that malloc() should be return
> with the limit (see /proc/sys/vm/overcommit_ratio).
[...]
> I don't think it's glibc bug.  It's linux specification.

In the 2.4 kernel documentation, Documentation/sysctl/vm.txt says:

overcommit_memory:

This value contains a flag that enables memory overcommitment.
When this flag is 0, the kernel checks before each malloc()
to see if there's enough memory left. If the flag is nonzero,
the system pretends there's always enough memory.

Isn't it clear?

So, according to linux specification, the kernel does the check if
overcommit_memory is 0 (my case)... unless the caller asks not to
reserve (but malloc() is a reservation, so I don't see why glibc
sets the MAP_NORESERVE flag, if I've understood correctly).

> I would like to close this bug, ok? If you want to discuss about
> this issue more, I recommend you lists:
> linux-kernel/linux-mm/kernelnewbies that are good place.

Please, don't close the bug now. See my explanations above. If you
think that glibc is doing the right thing, then the source should
probably be more documented.

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
Championnat International des Jeux Mathématiques et Logiques, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread GOTO Masanori
severity 247300 normal
thanks

At Tue, 4 May 2004 14:41:22 +0200,
Vincent Lefevre wrote:
> I've set the overcommit to 0, and the malloc() function never fails,
> even when there isn't enough memory left, making processes crash when
> they need memory they have already allocated.

"never" is actually not true.  Change ONEMB larger.  malloc() should
then fail.

> I'm not sure whether this is a libc6 or a kernel bug. I'm not familiar
> with the glibc source, but could this be due to the following code?

IIRC, this kind of OOM (Out Of Memory) situation has been discussed at
linux kernel related lists for a long time.  Your sample code repeats
to issue mmap() and to fill the acquired pages until it got problem.
So kernel needs to detect the physical memory exhaustion before
allocating virtual memory.  Please imagine how to do it (to deal with
various types of memories like mmap)?  Yeah, welcome to VM world.

AFAIK, in kernel 2.6, there is strictly overcommit mode
(/proc/sys/vm/overcommit_memory = 2) to prevent from getting sigkill
by OOM killer and so on.  It helps you that malloc() should be return
with the limit (see /proc/sys/vm/overcommit_ratio).

Rik van Riel et al. released various MM improvement patches.
Please check them if you keep using kernel 2.4.

> malloc/arena.c:
> 
>   /* A memory region aligned to a multiple of HEAP_MAX_SIZE is needed.
>      No swap space needs to be reserved for the following large
>      mapping (on Linux, this is the case for all non-writable mappings
>      anyway). */
>   p1 = (char *)MMAP(0, HEAP_MAX_SIZE<<1, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE);
>                                                                 ^

Use strace.


I don't think it's glibc bug.  It's linux specification.  I would like
to close this bug, ok?  If you want to discuss about this issue more,
I recommend you lists: linux-kernel/linux-mm/kernelnewbies that are
good place.

Regards,
-- gotom




Processed: Re: Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-06 Thread Debian Bug Tracking System
Processing commands for [EMAIL PROTECTED]:

> severity 247300 normal
Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash
Severity set to `normal'.

> thanks
Stopping processing here.

Please contact me if you need assistance.

Debian bug tracking system administrator
(administrator, Debian Bugs database)




Bug#247300: libc6: malloc() never fails on 2.4 kernels, making processes crash

2004-05-04 Thread Vincent Lefevre
Package: libc6
Version: 2.3.2.ds1-12
Severity: grave
Justification: causes non-serious data loss

I've set the overcommit to 0, and the malloc() function never fails,
even when there isn't enough memory left, making processes crash when
they need memory they have already allocated.

I'm not sure whether this is a libc6 or a kernel bug. I'm not familiar
with the glibc source, but could this be due to the following code?

malloc/arena.c:

  /* A memory region aligned to a multiple of HEAP_MAX_SIZE is needed.
     No swap space needs to be reserved for the following large
     mapping (on Linux, this is the case for all non-writable mappings
     anyway). */
  p1 = (char *)MMAP(0, HEAP_MAX_SIZE<<1, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE);
                                                                ^

FYI, here's my test program:

/* $Id: malloc.c 3174 2004-04-28 14:44:41Z lefevre $
 *
 * malloc() test
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ONEMB 1048576

int main (void)
{
  char *p;
  int i;

  /* Allocate 1 MB at a time and touch every byte until malloc() fails. */
  for (i = 1; (p = malloc(ONEMB)) != NULL; i++)
    {
      printf ("Got %d MB\n", i);
      memset (p, 0, ONEMB);
    }
  printf ("malloc() failed - OK\n");
  return 0;
}

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.4.26
Locale: LANG=POSIX, LC_CTYPE=en_US.ISO8859-1

Versions of packages libc6 depends on:
ii  libdb1-compat 2.1.3-7    The Berkeley database routines [gl

-- no debconf information



