Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-11 Thread Andrea Arcangeli

On Wed, Oct 11, 2000 at 11:08:41AM +0200, Helge Hafting wrote:
> Nothing wrong with a big init - the problem is a memory-leaking init.
> That one will die anyway, whether it dies early from an OOM-killer
> or later when all other processes are gone doesn't really matter.

Indeed.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-11 Thread Helge Hafting

Andrea Arcangeli wrote:
> 
> On Tue, Oct 10, 2000 at 09:06:49AM +0200, Helge Hafting wrote:
> > If you want init to live - prove that it doesn't eat too much memory.
> 
> I don't see why the machine should be stable only if init is small.
> My kernel won't be stable only if init is small since it doesn't cost
> anything to handle correctly the big init case.
>
Nothing wrong with a big init - the problem is a memory-leaking init.
That one will die anyway, whether it dies early from an OOM-killer
or later when all other processes are gone doesn't really matter.

Helge Hafting



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Miles Lane

Olaf Titz wrote:
> 
> > > Still, it would be nice to recover that 4 MB when the system
> > > doesn't have any memory left.
> > Yup. The X server could give back the memory for some cases like the
> > background without too much hackery.
> 
> Then Linux only needs to implement SIGDANGER, which has been talked
> about for years...
> 
> X would be a good candidate to implement a handler for it. Others are
> Emacs, Mozilla or JVMs - basically everything which has a GC of some
> sort. It could even be used to implement a configurable user mode OOM
> killer.

It would be good to talk to the KDE and Gnome folks about this as well.
I am pretty sure they have large blocks of memory that could be flushed
or freed in a low-memory or OOM condition.

Miles



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Linus Torvalds



On Tue, 10 Oct 2000, Rogier Wolff wrote:
> 
> So if Netscape can "pump" 40 extra megabytes of memory out of X, this
> can be exploited. 
> 
> Now we're back to the point that a heuristic can never be right all
> the time..

I agree. In fact, we never left that.

Nothing is perfect.

In fact, a lot of engineering is _recognizing_ that you can never achieve
"perfect", and you're much better off not even trying - and having a
simple system that is "good enough".

This is the old adage of "perfect is the enemy of good" - trying too hard
is actually _detrimental_ in 99% of all cases. We should have simple
heuristics that work most of the time, instead of trying to cajole a
complex system like X to help us do some complicated resource management
system.

Complexity will just result in the OOM killer failing in surprising ways.

A simple heuristic will mean that the OOM killer will still fail, but at
least it won't be in subtle and surprising ways.

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Philipp Rumpf

On Tue, Oct 10, 2000 at 12:30:51PM -0300, Rik van Riel wrote:
> Not killing init when we "should" definitely prevents
> embedded systems from auto-rebooting when they should
> do so.
> 
> (OTOH, I don't think embedded systems will run into
> this OOM issue too much)

but when they do, they're hard to fix.  Think about an elevator control
system with a single process that happens to implement a somewhat broken
version of the elevator algorithm ;)

> > that's what I said.  we need to be sure to _get_ a panic() though.
> 
> I believe the kernel automatically panic()s when init
> dies ... from kernel/exit.c::do_exit()
> 
>     if (tsk->pid == 1)
>             panic("Attempted to kill init!");

guess who added that code.  We still kill init with SIGTERM which doesn't
seem to work though.




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Rik van Riel

On Tue, 10 Oct 2000, Philipp Rumpf wrote:
> On Tue, Oct 10, 2000 at 12:06:07PM -0300, Rik van Riel wrote:
> > On Tue, 10 Oct 2000, Philipp Rumpf wrote:
> > > > > The algorithm you posted on the list in this thread will kill
> > > > > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > > > > and you execute a task that grows over 1M.
> > > > 
> > > > This sounds suspiciously like the description of a DEAD system ;)
> > > 
> > > But wouldn't a watchdog daemon which doesn't allocate any memory
> > > still get run ?
> > 
> > Indeed, it would. It would also /prevent/ the system
> > from automatically rebooting itself into a usable state ;)
> 
> So it's not dead in the "oh, it'll be back in 30 seconds" sense.  
> So our behaviour is broken (more so than random process
> killing).

*nod*

Not killing init when we "should" definitely prevents
embedded systems from auto-rebooting when they should
do so.

(OTOH, I don't think embedded systems will run into
this OOM issue too much)

> > > You care about getting an automatic reboot.  So you need to be sure the
> > > watchdog daemon gets killed first or you panic() after some time.
> > 
> > echo 30 > /proc/sys/kernel/panic
> 
> that's what I said.  we need to be sure to _get_ a panic() though.

I believe the kernel automatically panic()s when init
dies ... from kernel/exit.c::do_exit()

    if (tsk->pid == 1)
            panic("Attempted to kill init!");

[which will make our system auto-reboot and be back on its feet
in a healthy state again soon]
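
For a watchdog-style daemon that wants to set the panic timeout itself rather than rely on a shell, the `echo 30 > /proc/sys/kernel/panic` step can be sketched in C. This is only a minimal illustration: the function name `write_panic_timeout` is hypothetical, and the path is passed in so the code can be tried against an ordinary file without root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a timeout (in seconds) to a sysctl-style file such as
 * /proc/sys/kernel/panic. Returns 0 on success, -1 on failure.
 * Hypothetical helper, not part of any kernel or libc API.
 */
int write_panic_timeout(const char *path, int seconds)
{
    assert(path != NULL && seconds >= 0);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    /* The kernel parses a decimal integer followed by a newline. */
    int ok = fprintf(f, "%d\n", seconds) > 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling `write_panic_timeout("/proc/sys/kernel/panic", 30)` as root would request a reboot 30 seconds after a panic; it does nothing to guarantee that a panic actually happens, which is the point being argued in this thread.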

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Philipp Rumpf

On Tue, Oct 10, 2000 at 12:06:07PM -0300, Rik van Riel wrote:
> On Tue, 10 Oct 2000, Philipp Rumpf wrote:
> > > > The algorithm you posted on the list in this thread will kill
> > > > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > > > and you execute a task that grows over 1M.
> > > 
> > > This sounds suspiciously like the description of a DEAD system ;)
> > 
> > But wouldn't a watchdog daemon which doesn't allocate any memory
> > still get run ?
> 
> Indeed, it would. It would also /prevent/ the system
> from automatically rebooting itself into a usable state ;)

So it's not dead in the "oh, it'll be back in 30 seconds" sense.  So our
behaviour is broken (more so than random process killing).

> > You care about getting an automatic reboot.  So you need to be sure the
> > watchdog daemon gets killed first or you panic() after some time.
> 
> echo 30 > /proc/sys/kernel/panic

that's what I said.  we need to be sure to _get_ a panic() though.



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Rik van Riel

On Tue, 10 Oct 2000, Philipp Rumpf wrote:

> > > The algorithm you posted on the list in this thread will kill
> > > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > > and you execute a task that grows over 1M.
> > 
> > This sounds suspiciously like the description of a DEAD system ;)
> 
> But wouldn't a watchdog daemon which doesn't allocate any memory
> still get run ?

Indeed, it would. It would also /prevent/ the system
from automatically rebooting itself into a usable state ;)

> > (in which case you simply don't care if init is being killed or not)
> 
> You care about getting an automatic reboot.  So you need to be sure the
> watchdog daemon gets killed first or you panic() after some time.

echo 30 > /proc/sys/kernel/panic

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Rogier Wolff

Linus Torvalds wrote:
> Basically, the only thing _I_ think X can do is to really say "oh, please
> don't count my memory, because everything I do I do for my clients, not
> for myself". 
> 
> THAT is my argument. Basically there is nothing we can reliably account.
> 
> So we might as well fall back on just saying "X is more important than
> some random client", and have a mm niceness level. Which right now is
> obviously approximated by the IO capabilities tests etc.

FYI:

I ran my machine out of memory (without crashing by the way) this
weekend by loading a whole bunch of large images into netscape. I
noticed not being able to open more windows when I saw my swapspace
exhausted. I noticed the large netscape, and killed it. 

At that moment my X was still taking 80 MB of RAM. I manually killed it
and restarted it to get rid of that memory. 

So if Netscape can "pump" 40 extra megabytes of memory out of X, this
can be exploited. 

Now we're back to the point that a heuristic can never be right all
the time..

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
*   Common sense is the collection of*
**  prejudices acquired by age eighteen.   -- Albert Einstein 



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Andrea Arcangeli

On Tue, Oct 10, 2000 at 09:06:49AM +0200, Helge Hafting wrote:
> If you want init to live - prove that it doesn't eat too much memory.

I don't see why the machine should be stable only if init is small.
My kernel won't be stable only if init is small since it doesn't cost
anything to handle correctly the big init case.

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Andrea Arcangeli

On Tue, Oct 10, 2000 at 04:38:02AM +0100, Philipp Rumpf wrote:
> Init should never die.  If we get to do_exit in init we'll panic which is
> the right thing to do (reboot on critical systems).

If the page fault can fail with OOM on init, init will get a SIGSEGV while
running a signal handler (copy-user will return -EFAULT regardless of whether
it was an OOM or a real segfault), and it _won't_ panic and the system is
unusable.

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Marco Colombo

On Mon, 9 Oct 2000, Linus Torvalds wrote:

> On Mon, 9 Oct 2000, Rik van Riel wrote:
> >
> > > I'd prefer just X having a higher "mm nice level" or something.
> > 
> > Which it has, because:
> > 
> > 1) CAP_RAW_IO
> > 2) p->euid == 0
> 
> Oh, I agree, but we might want to generalize this a bit so that root could
> say "this process is important" and then drop root privileges and still
> get "credited" for the fact that it's important.
> 
> It's not a big deal. It works for X right now.

How about using

p->rlim[RLIMIT_AS].rlim_cur

to weight the badness points for a process?
On my system, a 128MB RAM + 256MB swap, it defaults to some (insane?) value:

bash$ ulimit -vH -vS
virtual memory (kbytes)  4194302
virtual memory (kbytes)  2105343

for every process, which just means it is unused.

The idea is:
1) set default for rlim[RLIMIT_AS].rlim_max to a saner value;
2) processes with higher rlim[RLIMIT_AS].rlim_cur get lower badness.

This way, the badness of a process is not proportional to its absolute
size, but to the fraction of allowed AS it is using. Processes
that are capable(CAP_SYS_RESOURCE) can set RLIMIT_AS to a very high value,
so they get fewer badness points. X is a perfect candidate.

User's runaway processes (netscape) will have lower rlim[RLIMIT_AS].rlim_cur,
thus will get higher badness.

Something like:

-   points = p->mm->total_vm;
+   points = p->mm->total_vm / (p->rlim[RLIMIT_AS].rlim_cur << AS_FACTOR);

with

#define AS_FACTOR 30

maybe? (this is Rik's call, he knows better than me how to balance it...)

It's simple, it's configurable. 1) may be enforced by the kernel, or
completely left to user space.
On my system, in its default configuration (no use of RLIMIT_AS),
it has no impact at all (all processes have the same limit).

Sounds good or am I missing something?
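
To make the proposed weighting concrete, here is a small user-space sketch of it. The struct and function names are hypothetical stand-ins for `p->mm->total_vm` and `p->rlim[RLIMIT_AS].rlim_cur`. One assumption is labeled in the code: the limit is shifted right by AS_FACTOR (expressing it in 1 GiB units), because shifting a byte limit left by 30 as written above would overflow the divisor; the divisor is also clamped to 1 so a small limit cannot divide by zero.

```c
#include <assert.h>
#include <stdint.h>

#define AS_FACTOR 30  /* 2^30 bytes = 1 GiB, the value suggested above */

/* Hypothetical stand-in for the two task fields the proposal uses. */
struct toy_task {
    uint64_t total_vm_pages;  /* plays the role of p->mm->total_vm */
    uint64_t as_limit_bytes;  /* plays the role of rlim[RLIMIT_AS].rlim_cur */
};

/* Badness shrinks as the allowed address space grows. */
uint64_t toy_badness(const struct toy_task *t)
{
    assert(t != 0);

    /* Assumption: right shift, so the limit is counted in GiB units. */
    uint64_t weight = t->as_limit_bytes >> AS_FACTOR;
    if (weight == 0)
        weight = 1;  /* limits below 1 GiB get no discount */

    return t->total_vm_pages / weight;
}
```

With two tasks of 20000 pages each, one limited to 4 GiB and the other to 1 GiB, the first scores 5000 and the second 20000, so the task that was granted the larger limit is the less likely victim. When every task has the same limit, the ordering is unchanged, which matches the "no impact at all" remark.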


.TM.
-- 
  /  /   /
 /  /   /   Marco Colombo
___/  ___  /   /  Technical Manager
   /  /   /  ESI s.r.l.
 _/ _/  _/ [EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread J.A. Sutherland

--On 09 October 2000, 17:40 -0300 Rik van Riel <[EMAIL PROTECTED]>
wrote:
> On Mon, 9 Oct 2000, James Sutherland wrote:
>> On Mon, 9 Oct 2000, Ingo Molnar wrote:
>> > On Mon, 9 Oct 2000, Rik van Riel wrote:
>> > 
>> > > > so dns helper is killed first, then netscape. (my idea might not
>> > > > make sense though.)
>> > > 
>> > > It makes some sense, but I don't think OOM is something that
>> > > occurs often enough to care about it /that/ much...
>> > 
>> > i'm trying to handle Andrea's case, the init=/bin/bash manual-bootup
>> > case, with 4MB RAM and no swap, where the admin tries to exec a 2MB
>> > process. I think it's a legitimate concern - i cannot know in advance
>> > whether a freshly started process would trigger an OOM or not.
>> 
>> Shouldn't the runtime factor handle this, making sure the new
>> process is killed? (Maybe not if you're almost OOM right from
>> the word go, and run this process straight off... Hrm.)
> 
> It should.
> 
> Also, the example is a tad unrealistic since init seems to be
> around 70 kB in size on my systems ;)

In extreme cases, though, you could arrange things so the
machine only has 100K of RAM when it loads init, at which
point init tries running, say, rc.sysinit - and everything goes 
bang. Of course, a machine like that won't be very much use
anyway...

More realistically, though, I could be running with something
like init=/bin/sash - does your statically linked sash binary
fit in 70K? :-)


James.



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Jamie Lokier

Andreas Dilger wrote:
> Having a SIGDANGER handler is good for 2 reasons:
> 1) Lets processes know when memory is short so they can free needless cache.
> 2) Mark process with a SIGDANGER handler as "more important" than those
>    without.  Most people won't care about this, but init, and X, and
>    long-running simulations might.

For point 1, it would be much nicer to have user processes participate
in memory balancing _before_ getting anywhere near an OOM state.

A nice way is to send SIGDANGER with siginfo saying how much memory the
kernel wants back (or how fast).  Applications that don't know to use
that info, but do have a SIGDANGER handler, will still react, just rather
more severely.
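
A rough user-space sketch of that idea follows. Linux has no SIGDANGER, so SIGUSR1 stands in for it here, and `sigqueue()` plays the part of the kernel by attaching the requested amount to the signal. All function and variable names are hypothetical. The handler only records the request; the actual freeing happens later from the main loop, since very little is safe to do inside a signal handler.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Assumption: SIGUSR1 stands in for the non-existent SIGDANGER. */
#define TOY_SIGDANGER SIGUSR1

static volatile sig_atomic_t wanted_kb = 0; /* written by the handler */
size_t cache_kb = 4096;                     /* pretend droppable cache */

static void on_danger(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    /* Only note how much was asked for; free it outside the handler. */
    wanted_kb = info->si_value.sival_int;
}

int install_danger_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_danger;
    sa.sa_flags = SA_SIGINFO;   /* deliver siginfo with the payload */
    sigemptyset(&sa.sa_mask);
    return sigaction(TOY_SIGDANGER, &sa, NULL);
}

/* Called from the main loop; returns the number of KB released. */
size_t service_danger_request(void)
{
    int want = wanted_kb;
    wanted_kb = 0;
    if (want <= 0)
        return 0;
    size_t freed = (size_t)want < cache_kb ? (size_t)want : cache_kb;
    cache_kb -= freed;
    return freed;
}

/* Stand-in for the kernel: ask this process to give back `kb`. */
int simulate_kernel_request(int kb)
{
    union sigval v;
    v.sival_int = kb;
    return sigqueue(getpid(), TOY_SIGDANGER, v);
}
```

An application that ignores the payload could simply drop its whole cache whenever the handler fires, which is the "react rather more severely" case.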

-- Jamie



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Jamie Lokier

Albert D. Cahalan wrote:
> X, and any other big friendly processes, could participate in
> memory balancing operations. X could be made to clean out a
> font cache when the kernel signals that memory is low. When
> the situation becomes serious, X could just mmap /dev/zero over
> top of the background image.

Haven't we already had this discussion?  Quite a lot of programs have
cached data (X fonts, Netscape (lots!)), GC-able data (Emacs, Java
etc.), data that can simply be discarded (X window backing stores), or
data that can be written to disk on demand (Netscape again).

-- Jamie



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread john slee

On Mon, Oct 09, 2000 at 06:34:29PM -0300, Rik van Riel wrote:
> On Mon, 9 Oct 2000, Ingo Molnar wrote:
> > On Mon, 9 Oct 2000, Rik van Riel wrote:
> > 
> > > Would this complexity /really/ be worth it for the twice-yearly OOM
> > > situation?
> > 
> > the only reason i suggested this was the init=/bin/bash, 4MB
> > RAM, no swap emergency-bootup case. We must not kill init in
> > that case - if the current code doesnt then great and none of
> > this is needed.

perhaps a boot time option oom=0 ?  since oom is such a rare case, this
wouldn't impact normal usage...

-- 
john slee <[EMAIL PROTECTED]>



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Olaf Titz

> > Still, it would be nice to recover that 4 MB when the system
> > doesn't have any memory left.
> Yup. The X server could give back the memory for some cases like the
> background without too much hackery.

Then Linux only needs to implement SIGDANGER, which has been talked
about for years...

X would be a good candidate to implement a handler for it. Others are
Emacs, Mozilla or JVMs - basically everything which has a GC of some
sort. It could even be used to implement a configurable user mode OOM
killer.

Olaf




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Helge Hafting

Andrea Arcangeli wrote:
> 
> On Mon, Oct 09, 2000 at 08:42:26PM +0200, Ingo Molnar wrote:
> > ignoring the kill would just preserve those bugs artificially.
> 
> If the oom killer kills a thing like init by mistake or init has a memleak
> you'll notice both problems regardless of having a magic for init in a _very_
> slow path so I don't buy your point.
> .
> For correctness init must not be killed ever, period.
> 
> So you have two choices:
> 
> o   math proof that the current algorithm without the magic can't end up
> killing init (and I should be able to prove the other way around
> instead)
> 
> o   have a magic check for init
> 
> So the magic is _strictly_ necessary at the moment.

A well-written init will be saved by being the oldest process around.
A memory-leaking init _will_ be killed even with your magic test,
when the kernel eventually gets stuck OOM and init is the only
process left (all the others have been OOM-killed before.)  
A deadlocked kernel doesn't schedule any processes, so they are all dead.

If you want init to live - prove that it doesn't eat too much memory.
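
The "magic check for init" under discussion can be pictured with a toy victim-selection loop. This is an illustrative sketch only, not the kernel's code: `struct toy_proc` and `pick_victim` are hypothetical names, and the badness values are assumed to have been computed elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a process for victim selection. */
struct toy_proc {
    int pid;
    unsigned long badness;
};

/*
 * Return the index of the process with the highest badness,
 * never choosing pid 1. Returns -1 when init is all that is left.
 */
int pick_victim(const struct toy_proc *procs, size_t n)
{
    assert(procs != NULL || n == 0);

    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].pid == 1)
            continue;  /* the "magic": init is never a candidate */
        if (best < 0 || procs[i].badness > procs[best].badness)
            best = (int)i;
    }
    return best;
}
```

The -1 case is the situation described above: if a leaking init is the only process left, the check leaves the OOM killer with nothing to kill and the machine is stuck anyway.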

Helge Hafting
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Helge Hafting

Andrea Arcangeli wrote:
 
 On Mon, Oct 09, 2000 at 08:42:26PM +0200, Ingo Molnar wrote:
  ignoring the kill would just preserve those bugs artificially.
 
 If the oom killer kills a thing like init by mistake or init has a memleak
 you'll notice both problems regardless of having a magic for init in a _very_
 slow path so I don't buy your point.
 .
 For corretness init must not be killed ever, period.
 
 So you have two choices:
 
 o   math proof that the current algorithm without the magic can't end
 killing init (and I should be able to proof the other way around
 instead)
 
 o   have a magic check for init
 
 So the magic is _strictly_ necessary at the moment.

A well-written init will be saved by being the oldest process around.
A memory-leaking init _will_ be killed even whith your magic test,
when the kernel eventually gets stuck OOM and init is the only
process left (all the other have been OOM-killed before.)  
A deadlocked kernel don't schedule any processes, so they are all dead.

If you want init to live - prove that it don't eat too much memory.

Helge Hafting
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Jamie Lokier

Albert D. Cahalan wrote:
 X, and any other big friendly processes, could participate in
 memory balancing operations. X could be made to clean out a
 font cache when the kernel signals that memory is low. When
 the situation becomes serious, X could just mmap /dev/zero over
 top of the background image.

Haven't we already had this discussion?  Quite a lot of programs have
cached data (X fonts, Netscape (lots!)), GC-able data (Emacs, Java
etc.), data that can simply be discarded (X window backing stores), or
data that can be written to disk on demand (Netscape again).

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Jamie Lokier

Andreas Dilger wrote:
 Having a SIGDANGER handler is good for 2 reasons:
 1) Lets processes know when memory is short so they can free needless cache.
 2) Mark process with a SIGDANGER handler as "more important" than those
without.  Most people won't care about this, but init, and X, and
long-running simulations might.

For point 1, it would be much nicer to have user processes participate
in memory balancing _before_ getting anywhere near an OOM state.

A nice way is to send SIGDANGER with siginfo saying how much memory the
kernel wants back (or how fast).  Applications that don't know to use
that info, but do have a SIGDANGER handler, will still react just rather
more severely.

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread J.A. Sutherland

--On 09 October 2000, 17:40 -0300 Rik van Riel [EMAIL PROTECTED]
wrote:
 On Mon, 9 Oct 2000, James Sutherland wrote:
 On Mon, 9 Oct 2000, Ingo Molnar wrote:
  On Mon, 9 Oct 2000, Rik van Riel wrote:
  
so dns helper is killed first, then netscape. (my idea might not
make sense though.)
   
   It makes some sense, but I don't think OOM is something that
   occurs often enough to care about it /that/ much...
  
  i'm trying to handle Andrea's case, the init=/bin/bash manual-bootup
  case, with 4MB RAM and no swap, where the admin tries to exec a 2MB
  process. I think it's a legitimate concern - i cannot know in advance
  whether a freshly started process would trigger an OOM or not.
 
 Shouldn't the runtime factor handle this, making sure the new
 process is killed? (Maybe not if you're almost OOM right from
 the word go, and run this process straight off... Hrm.)
 
 It should.
 
 Also, the example is a tad unrealistic since init seems to be
 around 70 kB in size on my systems ;)

In extreme cases, though, you could arrange things so the
machine only has 100K of RAM when it loads init, at which
point init tries running, say, rc.sysinit - and everything goes 
bang. Of course, a machine like that won't be very much use
anyway...

More realistically, though, I could be running with something
like init=/bin/sash - does your statically linked sash binary
fit in 70K? :-)


James.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Marco Colombo

On Mon, 9 Oct 2000, Linus Torvalds wrote:

 On Mon, 9 Oct 2000, Rik van Riel wrote:
 
   I'd prefer just X having a higher "mm nice level" or something.
  
  Which it has, because:
  
  1) CAP_RAW_IO
  2) p-euid == 0
 
 Oh, I agree, but we might want to generalize this a bit so that root could
 say "this process is important" and then drop root privileges and still
 get "credited" for the fact that it's important.
 
 It's not a big deal. It works for X right now.

How about using

p-rlim[RLIMIT_AS].rlim_cur

to weight the badness point for a process?
On my system, a 128MB RAM + 256MB swap, it defaults to some (insane?) value:

bash$ ulimit -vH -vS
virtual memory (kbytes)  4194302
virtual memory (kbytes)  2105343

for every process, which just means it is unused.

The idea is:
1) set default for rlim[RLIMIT_AS].rlim_max to a saner value;
2) processes with higher rlim[RLIMIT_AS].rlim_cur get lower badness.

This way, the badness of a process is not proportional to its absolute
size, but to the fraction of allowed AS it is using. Processes
that are capable(CAP_SYS_RESOURCE) can set RLIMIT_AS to a very high value,
so they get less badness point. X is a perfect candidate.

User's runaway processes (netscape) will have lower rlim[RLIMIT_AS].rlim_cur,
thus will get higher badness.

Something like:

-   points = p-mm-total_vm;
+   points = p-mm-total_vm / (p-rlim[RLIMIT_AS].rlim_cur  AS_FACTOR);

with

#define AS_FACTOR 30

maybe? (this is Rik's call, he knows better than me how to balance it...)

It's simple, it's configurable. 1) may be enforced by the kernel, or
completely left to user space.
On my system, in its default configuration (no use of RLIMIT_AS),
it has no impact at all (all processes have the same limit).

Sounds good or am I missing something?

 
   Linus
 
 -
 To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
 the body of a message to [EMAIL PROTECTED]
 Please read the FAQ at http://www.tux.org/lkml/
 

.TM.
-- 
  /  /   /
 /  /   /   Marco Colombo
___/  ___  /   /  Technical Manager
   /  /   /  ESI s.r.l.
 _/ _/  _/ [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Andrea Arcangeli

On Tue, Oct 10, 2000 at 04:38:02AM +0100, Philipp Rumpf wrote:
 Init should never die.  If we get to do_exit in init we'll panic which is
 the right thing to do (reboot on critical systems).

If the page fault can fail with OOM on init, init will get a SIGSEGV while
running a signal handler (copy-user will return -EFAULT regardless it was an
oom or a real segfault) and it _won't_ panic and the system is unusable.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Andrea Arcangeli

On Tue, Oct 10, 2000 at 09:06:49AM +0200, Helge Hafting wrote:
 If you want init to live - prove that it don't eat too much memory.

I don't see why the machine should be stable only if init is small.
My kernel won't be stable only if init is small since it doesn't cost
anything to handle correctly the big init case.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Rogier Wolff

Linus Torvalds wrote:
 Basically, the only thing _I_ think X can do is to really say "oh, please
 don't count my memory, because everything I do I do for my clients, not
 for myself". 
 
 THAT is my argument. Basically there is nothing we can reliably account.
 
 So we might as well fall back on just saying "X is more important than
 some random client", and have a mm niceness level. Which right now is
 obviously approximated by the IO capabilities tests etc.

FYI:

I ran my machine out of memory (without crashing by the way) this
weekend by loading a whole bunch of large images into netscape. I
noticed not being able to open more windows when I saw my swapspace
exhausted. I noticed the large netscape, and killed it. 

At that moment my X was still taking 80Mb of RAM. I manually killed it
and restarted it to get rid of that memory. 

So if Netscape can "pump" 40 extra megabytes of memory out of X, this
can be exploited. 

Now we're back to the point that a heuristic can never be right all
the time..

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
*   Common sense is the collection of*
**  prejudices acquired by age eighteen.   -- Albert Einstein 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] VM fix for 2.4.0-test9 OOM handler

2000-10-10 Thread Rik van Riel

On Tue, 10 Oct 2000, Philipp Rumpf wrote:

> > > The algorithm you posted on the list in this thread will kill
> > > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > > and you execute a task that grows over 1M.
> >
> > This sounds suspiciously like the description of a DEAD system ;)
>
> But wouldn't a watchdog daemon which doesn't allocate any memory
> still get run ?

Indeed, it would. It would also /prevent/ the system
from automatically rebooting itself into a usable state ;)

> > (in which case you simply don't care if init is being killed or not)
>
> You care about getting an automatic reboot.  So you need to be sure the
> watchdog daemon gets killed first or you panic() after some time.

echo 30 > /proc/sys/kernel/panic
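
For illustration, the same setting can be written from a program. This is
only a minimal sketch: set_panic_timeout() is a hypothetical helper, not
something from this thread or from any library, and it takes the path as an
argument so it can be tried on a scratch file. Writing the real
/proc/sys/kernel/panic needs root.

```c
#include <stdio.h>

/*
 * Hypothetical helper (an assumption for this sketch): write a panic
 * timeout, in seconds, to the given sysctl file, for example
 * /proc/sys/kernel/panic. This sketch only accepts non-negative values.
 * Returns 0 on success, -1 on failure.
 */
int set_panic_timeout(const char *path, int seconds)
{
    FILE *f;
    int ok;

    if (seconds < 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%d\n", seconds) > 0;
    /* fclose() flushes; a failed flush means the write did not land */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With a timeout of 30, a panic() is followed by a reboot about 30 seconds
later, which is the automatic recovery being discussed here.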

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Rik van Riel

On Tue, 10 Oct 2000, Philipp Rumpf wrote:
> On Tue, Oct 10, 2000 at 12:06:07PM -0300, Rik van Riel wrote:
> > On Tue, 10 Oct 2000, Philipp Rumpf wrote:
> > > > > The algorithm you posted on the list in this thread will kill
> > > > > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > > > > and you execute a task that grows over 1M.
> > > >
> > > > This sounds suspiciously like the description of a DEAD system ;)
> > >
> > > But wouldn't a watchdog daemon which doesn't allocate any memory
> > > still get run ?
> >
> > Indeed, it would. It would also /prevent/ the system
> > from automatically rebooting itself into a usable state ;)
>
> So it's not dead in the "oh, it'll be back in 30 seconds" sense.
> So our behaviour is broken (more so than random process
> killing).

*nod*

Not killing init when we "should" definitely prevents
embedded systems from auto-rebooting when they should
do so.

(OTOH, I don't think embedded systems will run into
this OOM issue too much)

> > > You care about getting an automatic reboot.  So you need to be sure the
> > > watchdog daemon gets killed first or you panic() after some time.
> >
> > echo 30 > /proc/sys/kernel/panic
>
> that's what I said.  we need to be sure to _get_ a panic() though.

I believe the kernel automatically panic()s when init
dies ... from kernel/exit.c::do_exit()

    if (tsk->pid == 1)
        panic("Attempted to kill init!");

[which will make our system auto-reboot and be back on its feet
in a healthy state again soon]

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Philipp Rumpf

On Tue, Oct 10, 2000 at 12:30:51PM -0300, Rik van Riel wrote:
> Not killing init when we "should" definately prevents
> embedded systems from auto-rebooting when they should
> do so.
>
> (OTOH, I don't think embedded systems will run into
> this OOM issue too much)

but when they do, they're hard to fix.  Think about an elevator control
system with a single process that happens to implement a somewhat broken
version of the elevator algorithm ;)

> > that's what I said.  we need to be sure to _get_ a panic() though.
>
> I believe the kernel automatically panic()s when init
> dies ... from kernel/exit.c::do_exit()
>
>     if (tsk->pid == 1)
>         panic("Attempted to kill init!");

guess who added that code.  We still kill init with SIGTERM which doesn't
seem to work though.




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Linus Torvalds



On Tue, 10 Oct 2000, Rogier Wolff wrote:
 
> So if Netscape can "pump" 40 extra megabytes of memory out of X, this
> can be exploited.
>
> Now we're back to the point that a heuristic can never be right all
> the time..

I agree. In fact, we never left that.

Nothing is perfect.

In fact, a lot of engineering is _recognizing_ that you can never achieve
"perfect", and you're much better off not even trying - and having a
simple system that is "good enough".

This is the old adage of "perfect is the enemy of good" - trying too hard
is actually _detrimental_ in 99% of all cases. We should have simple
heuristics that work most of the time, instead of trying to cajole a
complex system like X to help us do some complicated resource management
system.

Complexity will just result in the OOM killer failing in surprising ways.

A simple heuristic will mean that the OOM killer will still fail, but at
least it won't be in subtle and surprising ways.

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-10 Thread Miles Lane

Olaf Titz wrote:
 
> > > Still, it would be nice to recover that 4 MB when the system
> > > doesn't have any memory left.
> > Yup. The X server could give back the memory for some cases like the
> > background without too much hackery.
>
> Then Linux only needs to implement SIGDANGER, which has been talked
> about for years...
>
> X would be a good candidate to implement a handler for it. Others are
> Emacs, Mozilla or JVMs - basically everything which has a GC of some
> sort. It could even be used to implement a configurable user mode OOM
> killer.

It would be good to talk to the KDE and Gnome folks about this as well.
I am pretty sure they have large blocks of memory that could be flushed
or freed in a low-memory or OOM condition.

Miles



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Andreas Dilger wrote:

> Albert D. Cahalan wrote:
> > X, and any other big friendly processes, could participate in
> > memory balancing operations. X could be made to clean out a
>
> Gerrit Huizenga wrote:
> > Anyway, there is/was an API in PTX to say (either from in-kernel or through
> > some user machinations) "I Am a System Process".  Turns on a bit in the
>
> On AIX there is a signal called SIGDANGER, which is basically what you
> are looking for.  By default it is ignored, but for processes that care
> (e.g. init, X, whatever) they can register a SIGDANGER handler.  At an
> "urgent" (as oposed to "critical") OOM situation, all processes get a
> SIGDANGER sent to them.  Most will ignore it, but ones with handlers
> can free caches, try to do a clean shutdown, whatever.  Any process with
> a SIGDANGER handler get a reduction of "badness" (as the OOM killer calls
> it) when looking for processes to kill.
>
> Having a SIGDANGER handler is good for 2 reasons:
> 1) Lets processes know when memory is short so they can free needless cache.
> 2) Mark process with a SIGDANGER handler as "more important" than those
>without.  Most people won't care about this, but init, and X, and
>long-running simulations might.

Is there any reason why we can't do something like this for 2.5?

-d

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President






Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andreas Dilger

> Rik van Riel wrote:
> > > How about SIGTERM a bit before SIGKILL then re-evaluate the OOM
> > > N usecs later?
> >
> > And run the risk of having to kill /another/ process as well ?
> >
> > I really don't know if that would be a wise thing to do
> > (but feel free to do some tests to see if your idea would
> > work ... I'd love to hear some test results with your idea).

David Ford writes:
> I was thinking (dangerous) about an urgent v.s. critical OOM.  urgent could
> trigger a SIGTERM which would give advance notice to the offending process.
> I don't think we have a signal method of notifying processes when resources
> are critically low, feel free to correct me.
> 
> Is there a signal that -might- be used for this?

Albert D. Cahalan wrote:
> X, and any other big friendly processes, could participate in
> memory balancing operations. X could be made to clean out a
> font cache when the kernel signals that memory is low. When
> the situation becomes serious, X could just mmap /dev/zero over
> top of the background image.
>
> Netscape could even be hacked to dump old junk... or if it is
> just too leaky, it could exec itself to fix the problem.

Gerrit Huizenga wrote:
> Anyway, there is/was an API in PTX to say (either from in-kernel or through
> some user machinations) "I Am a System Process".  Turns on a bit in the
> proc struct (task struct) that made it exempt from death from a variety
> of sources, e.g. OOM, generic user signals, portions of system shutdown,
> etc.
> 
> Then, the code looking for things to kill simply skips those that are
> intelligently marked, taking most of the decision making/policy making
> out of the scheduler/memory manager.

On AIX there is a signal called SIGDANGER, which is basically what you
are looking for.  By default it is ignored, but processes that care
(e.g. init, X, whatever) can register a SIGDANGER handler.  In an
"urgent" (as opposed to "critical") OOM situation, all processes get a
SIGDANGER sent to them.  Most will ignore it, but ones with handlers
can free caches, try to do a clean shutdown, whatever.  Any process with
a SIGDANGER handler gets a reduction of "badness" (as the OOM killer calls
it) when looking for processes to kill.

Having a SIGDANGER handler is good for 2 reasons:
1) It lets processes know when memory is short so they can free needless cache.
2) It marks processes with a SIGDANGER handler as "more important" than those
   without.  Most people won't care about this, but init, and X, and
   long-running simulations might.
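
For illustration, a process that wants to cooperate could look roughly like
the sketch below. Everything in it is an assumption made for the example:
Linux has no SIGDANGER, so SIGUSR1 stands in for it, and the cache and the
function names (install_danger_handler, cache_fill, cache_shrink_if_low) are
hypothetical. The handler only sets a flag; the actual freeing happens later
in normal program context, since free() is not safe inside a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in: Linux has no SIGDANGER, so this sketch borrows SIGUSR1. */
#ifndef SIGDANGER
#define SIGDANGER SIGUSR1
#endif

static volatile sig_atomic_t memory_low;  /* set by the handler */
static void *cache;                       /* memory we can afford to drop */
static size_t cache_size;

/* Async-signal-safe: only records that the warning arrived. */
static void danger_handler(int sig)
{
    (void)sig;
    memory_low = 1;
}

/* Register the handler; returns 0 on success, -1 on error. */
int install_danger_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = danger_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGDANGER, &sa, NULL);
}

/* (Re)allocate the cache with n bytes; returns 0 on success. */
int cache_fill(size_t n)
{
    free(cache);
    cache = malloc(n);
    cache_size = cache ? n : 0;
    return cache ? 0 : -1;
}

/*
 * Called from the main loop: if a warning arrived, drop the cache.
 * Returns the number of bytes released (0 if nothing was done).
 */
size_t cache_shrink_if_low(void)
{
    size_t freed = 0;

    if (memory_low) {
        memory_low = 0;
        freed = cache_size;
        free(cache);
        cache = NULL;
        cache_size = 0;
    }
    return freed;
}
```

The "reduction of badness" part would live in the kernel, which could check
whether a task has a handler installed for the signal; that side is not
shown here.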

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/   -- Dogbert



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Philipp Rumpf

> If init dies the kernel hangs solid anyway

Init should never die.  If we get to do_exit in init we'll panic which is
the right thing to do (reboot on critical systems).



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Philipp Rumpf

> (but I'd be curious if somebody actually manages to
> trick the OOM killer into killing init ... please
> test a bit more to see if this really happens ;))

In a non-real-world situation, yes.  (mem=3500k, many drivers, init=/bin/bash,
tried to enter a command).  Since the process in question (bash) ignores
SIGTERM, I actually got a hard hang. 

We really should turn this into a panic() (panic means your elevator control
system reboots and maybe misses the right floor.  hard hang means you need
to reboot manually).





Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Jim Gettys


"Albert D. Cahalan" <[EMAIL PROTECTED]> writes: 
> Date: Mon, 9 Oct 2000 19:13:25 -0400 (EDT)
>
> >> From: Linus Torvalds <[EMAIL PROTECTED]>
> 
> >> One of the biggest bitmaps is the background bitmap. So you have a
> >> client that uploads it to X and then goes away. There's nobody to
> >> un-count to by the time X decides to switch to another background.
> >
> > Actually, the big offenders are things other than the background
> > bitmap: things like E do absolutely insane things, you would not
> > believe (or maybe you would).  The background pixmap is generally
> > in the worst case typically no worse than 4 megabytes (for those
> > people who are crazy enough to put images up as their root window
> > on 32 bit deep displays, at 1kX1k resolution).
> 
> Still, it would be nice to recover that 4 MB when the system
> doesn't have any memory left.
> 

Yup. The X server could give back the memory for some cases like the
background without too much hackery.

> X, and any other big friendly processes, could participate in
> memory balancing operations. X could be made to clean out a
> font cache when the kernel signals that memory is low. When
> the situation becomes serious, X could just mmap /dev/zero over
> top of the background image.

I agree in principle, though the problem is difficult, as the memory pool 
may get fragmented... Most memory usage is less monolithic than the 
background pixmap.

And maintaining separate memory pools often wastes more memory than it
saves.

> 
> Netscape could even be hacked to dump old junk... or if it is
> just too leaky, it could exec itself to fix the problem.

Netscape 4.x is hopeless; it is leakier than the Titanic.  There is hope 
for Mozilla.
- Jim


--
Jim Gettys
Technology and Corporate Development
Compaq Computer Corporation
[EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Oeser

On Mon, Oct 09, 2000 at 04:07:32PM -0300, Rik van Riel wrote:
> > If the oom killer kills a thing like init by mistake
> That only happens in the "random" OOM killer 2.2 has ...

[OOM killer war]

Hi there,

before you argue endlessly about the "Right OOM Killer (TM)", I
did a small patch to allow replacing the OOM killer at runtime.

You can even use modules, if you are careful (see khttpd on how
to do this without refcounting).

So now you can stop arguing about the one and only OOM killer,
implement it, provide it as module and get back to the important
stuff ;-)

PS: Patch is against test9 with Rik's latest vmpatch applied.

Thanks for listening

Ingo Oeser

diff -Naur linux-2.4.0-test9-vmpatch/include/linux/swap.h 
linux-2.4.0-test9-vmpatch-ioe/include/linux/swap.h
--- linux-2.4.0-test9-vmpatch/include/linux/swap.h  Sun Oct  8 00:49:17 2000
+++ linux-2.4.0-test9-vmpatch-ioe/include/linux/swap.h  Tue Oct 10 00:50:17 2000
@@ -129,6 +129,9 @@
 /* linux/mm/oom_kill.c */
 extern int out_of_memory(void);
 extern void oom_kill(void);
+void install_oom_killer(void (*new_oom_kill)(void));
+void reset_default_oom_killer(void);
+
 
 /*
  * Make these inline later once they are working properly.
diff -Naur linux-2.4.0-test9-vmpatch/mm/Makefile 
linux-2.4.0-test9-vmpatch-ioe/mm/Makefile
--- linux-2.4.0-test9-vmpatch/mm/Makefile   Sun Oct  8 00:49:17 2000
+++ linux-2.4.0-test9-vmpatch-ioe/mm/Makefile   Tue Oct 10 00:10:07 2000
@@ -10,7 +10,8 @@
 O_TARGET := mm.o
 O_OBJS  := memory.o mmap.o filemap.o mprotect.o mlock.o mremap.o \
vmalloc.o slab.o bootmem.o swap.o vmscan.o page_io.o \
-   page_alloc.o swap_state.o swapfile.o numa.o oom_kill.o
+   page_alloc.o swap_state.o swapfile.o numa.o
+OX_OBJS  := oom_kill.o
 
 ifeq ($(CONFIG_HIGHMEM),y)
 O_OBJS += highmem.o
diff -Naur linux-2.4.0-test9-vmpatch/mm/oom_kill.c 
linux-2.4.0-test9-vmpatch-ioe/mm/oom_kill.c
--- linux-2.4.0-test9-vmpatch/mm/oom_kill.c Sun Oct  8 00:49:17 2000
+++ linux-2.4.0-test9-vmpatch-ioe/mm/oom_kill.c Tue Oct 10 00:35:32 2000
@@ -13,6 +13,8 @@
  *  machine) this file will double as a 'coding guide' and a signpost
  *  for newbie kernel hackers. It features several pointers to major
  *  kernel subsystems and hints as to where to find out what things do.
+ *
+ *  Added oom_killer API for special needs - Ingo Oeser
  */
 
 #include 
@@ -147,7 +149,9 @@
  * CAP_SYS_RAW_IO set, send SIGTERM instead (but it's unlikely that
  * we select a process with CAP_SYS_RAW_IO set).
  */
-void oom_kill(void)
+
+
+static void oom_kill_rik(void)
 {
 
struct task_struct *p = select_bad_process();
@@ -207,4 +211,26 @@
 
/* Else... */
return 1;
+}
+
+/* Protects oom_killer against resetting during its execution */
+static rwlock_t oom_kill_lock;
+
+static void (*oom_killer)(void)=oom_kill_rik;
+
+void oom_kill(void) {
+   read_lock(&oom_kill_lock);
+   oom_killer();
+   read_unlock(&oom_kill_lock);
+}
+
+void install_oom_killer(void (*new_oom_kill)(void)) {
+   if (!new_oom_kill) return;
+   write_lock(&oom_kill_lock);
+   oom_killer=new_oom_kill;
+   write_unlock(&oom_kill_lock);
+}
+
+void reset_default_oom_killer(void) {
+   install_oom_killer(&oom_kill_rik);
 }
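
For illustration, the same hook can be modelled in user space. This is a
sketch under stated assumptions, not the patch itself: it swaps the kernel
rwlock for a C11 atomic function pointer, and default_oom_kill() and
counting_oom_kill() are hypothetical stand-ins for real policies. Note that
an atomic pointer alone does not keep a module's code from going away while
its function is still running, which is what the lock in the patch is for.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef void (*oom_fn)(void);

static int default_calls;  /* how often the built-in policy ran */
static int custom_calls;   /* how often the replacement policy ran */

/* Stand-in for the built-in policy (oom_kill_rik in the patch). */
static void default_oom_kill(void)
{
    default_calls++;
}

/* Example replacement policy; a real one would pick and kill a task. */
void counting_oom_kill(void)
{
    custom_calls++;
}

/* Currently installed policy; starts out as the default. */
static _Atomic(oom_fn) oom_killer = default_oom_kill;

/* Entry point: run whichever policy is installed right now. */
void oom_kill(void)
{
    oom_fn fn = atomic_load(&oom_killer);

    fn();
}

/* Install a replacement policy; a NULL pointer is ignored. */
void install_oom_killer(oom_fn new_oom_kill)
{
    if (!new_oom_kill)
        return;
    atomic_store(&oom_killer, new_oom_kill);
}

/* Go back to the built-in policy. */
void reset_default_oom_killer(void)
{
    install_oom_killer(default_oom_kill);
}
```

A module would call install_oom_killer() when it loads and
reset_default_oom_killer() before it unloads.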

-- 
Feel the power of the penguin - run [EMAIL PROTECTED]
:x



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Albert D. Cahalan wrote:
> Jim Gettys writes:
> >> From: Linus Torvalds <[EMAIL PROTECTED]>
> 
> >> One of the biggest bitmaps is the background bitmap. So you have a
> >> client that uploads it to X and then goes away. There's nobody to
> >> un-count to by the time X decides to switch to another background.
> >
> > Actually, the big offenders are things other than the background
> > bitmap: things like E do absolutely insane things, you would not
> > believe (or maybe you would).  The background pixmap is generally
> > in the worst case typically no worse than 4 megabytes (for those
> > people who are crazy enough to put images up as their root window
> > on 32 bit deep displays, at 1kX1k resolution).
> 
> Still, it would be nice to recover that 4 MB when the system
> doesn't have any memory left.
> 
> X, and any other big friendly processes, could participate in
> memory balancing operations. X could be made to clean out a
> font cache when the kernel signals that memory is low. When
> the situation becomes serious, X could just mmap /dev/zero over
> top of the background image.
> 
> Netscape could even be hacked to dump old junk... or if it is
> just too leaky, it could exec itself to fix the problem.

Which is all good and well to DELAY the task of the OOM killer
for a few more minutes.

But in the end, there will be a point where you REALLY run out
of memory and you have no other choice than the OOM killer...

(not that I'm against alternative measures, I just think they're
orthogonal to this whole discussion)

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Albert D. Cahalan

Jim Gettys writes:
>> From: Linus Torvalds <[EMAIL PROTECTED]>

>> One of the biggest bitmaps is the background bitmap. So you have a
>> client that uploads it to X and then goes away. There's nobody to
>> un-count to by the time X decides to switch to another background.
>
> Actually, the big offenders are things other than the background
> bitmap: things like E do absolutely insane things, you would not
> believe (or maybe you would).  The background pixmap is generally
> in the worst case typically no worse than 4 megabytes (for those
> people who are crazy enough to put images up as their root window
> on 32 bit deep displays, at 1kX1k resolution).

Still, it would be nice to recover that 4 MB when the system
doesn't have any memory left.

X, and any other big friendly processes, could participate in
memory balancing operations. X could be made to clean out a
font cache when the kernel signals that memory is low. When
the situation becomes serious, X could just mmap /dev/zero over
top of the background image.
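
For illustration, mapping /dev/zero over an existing buffer could look like
the sketch below. The names image_alloc() and image_discard() are
hypothetical, and this says nothing about how an X server actually stores
its background. The buffer has to be page-aligned, which is why the sketch
allocates it with mmap() as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of /dev/zero privately. With addr == NULL this
 * allocates fresh zero-filled memory; with a page-aligned addr it
 * replaces whatever was mapped there (MAP_FIXED).
 */
static void *map_zero(void *addr, size_t len)
{
    int flags = MAP_PRIVATE | (addr ? MAP_FIXED : 0);
    int fd = open("/dev/zero", O_RDWR);
    void *p;

    if (fd < 0)
        return MAP_FAILED;
    p = mmap(addr, len, PROT_READ | PROT_WRITE, flags, fd, 0);
    close(fd);  /* the mapping outlives the descriptor */
    return p;
}

/* Allocate a page-aligned buffer, e.g. for a background image. */
void *image_alloc(size_t len)
{
    void *p = map_zero(NULL, len);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Give the image's pages back: the old contents are discarded and the
 * range reads as zeros again. img must come from image_alloc().
 * Returns 0 on success, -1 on failure.
 */
int image_discard(void *img, size_t len)
{
    if (!img)
        return -1;
    return map_zero(img, len) == MAP_FAILED ? -1 : 0;
}
```

After image_discard() the address range stays valid, so nothing that still
holds a pointer to the image crashes; it just sees a black background.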

Netscape could even be hacked to dump old junk... or if it is
just too leaky, it could exec itself to fix the problem.





Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Tue, 10 Oct 2000, bert hubert wrote:
> On Mon, Oct 09, 2000 at 02:38:10PM -0700, Linus Torvalds wrote:
> 
> > So the process that gave X the bitmap dies. What now? Are we going to
> > depend on X un-counting the resources?
> > 
> > I'd prefer just X having a higher "mm nice level" or something.
> 
> I wonder how many megabytes we can fill with all messages about
> an OOM killer. I remember threads about this from '94 onwards.
> Perhaps we can finally have a sane one now :-)

In reality, the OOM killer I mailed a few days ago behaves
quite well in the real world.

I hope Linus will be as sensitive to theoretical arguments
with no foundation in reality as I am (ie. not), so we'll
have SOMETHING in the kernel soon.

If we later find out there are some problems with the OOM
killer, we can always change it then. No need to hold up
a reasonable solution when the current kernel has NO solution
to the problem at all ...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Byron Stanoszek wrote:
> On Mon, 9 Oct 2000 [EMAIL PROTECTED] wrote:
> 
> > Anyway, there is/was an API in PTX to say (either from in-kernel or through
> > some user machinations) "I Am a System Process".  Turns on a bit in the
> > proc struct (task struct) that made it exempt from death from a variety
> > of sources, e.g. OOM, generic user signals, portions of system shutdown,
> > etc.
> 
> The current OOM killer does this, except for init. Checking to
> see if the process has a page table is equivalent to checking
> for the kernel threads that are integral to the system (PIDs
> 2-5). These will never be killed by the OOM. Init, however,
> still can be killed, and there should be an additional statement
> that doesn't kill if PID == 1.

Only if you can demonstrate any real-world scenario where 
init will be chosen with the current algorithm.

The "3 MB init on 4MB machine" kind of theoretical argument
just isn't convincing if nobody can show that there is a
problem in reality.

> I think we need to sit down and write a better OOM proposal,
> something that doesn't use CPU time and the NICE flag.

The nice flag has been removed from my current kernel tree.

The CPU time used, however, is a different matter. You really
don't want to have the OOM killer kill your 6-week-old running
simulation because a newly started netscape explodes ...

> How about we start by everyone in this discussion give their
> opinion on what the OOM selection process should do,

Quoting from mm/oom_kill.c:

/**
 * oom_badness - calculate a numeric value for how bad this task has been
 * @p: task struct of which task we should calculate
 *
 * The formula used is relatively simple and documented inline in the
 * function. The main rationale is that we want to select a good task
 * to kill when we run out of memory.
 *
 * Good in this context means that:
 * 1) we lose the minimum amount of work done
 * 2) we recover a large amount of memory
 * 3) we don't kill anything innocent of eating tons of memory
 * 4) we want to kill the minimum amount of processes (one)
 * 5) we try to kill the process the user expects us to kill, this
 *    algorithm has been meticulously tuned to meet the principle
 *    of least surprise ... (be careful when you change it)
 */

Do you have any additional requirements?
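
For illustration, goals 1, 2 and 4 can be sketched in user space as below.
This is not the code in mm/oom_kill.c: struct task, its fields and the exact
scoring are assumptions made for the example. The idea is only that memory
raises the score while accumulated CPU time lowers it, so a long-running
simulation scores far below a freshly started memory hog.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a task for this sketch. */
struct task {
    int pid;
    unsigned long mem_pages;  /* memory we would recover */
    unsigned long cpu_secs;   /* CPU time used so far */
};

/* Integer square root, rounded down (linear search, fine for a sketch). */
static unsigned long int_sqrt(unsigned long x)
{
    unsigned long r = 0;

    /* (r + 1)^2 <= x, written with a division to avoid overflow */
    while (r + 1 <= x / (r + 1))
        r++;
    return r;
}

/*
 * Higher score = better candidate to kill. Memory raises the score
 * (goal 2); accumulated CPU time lowers it (goal 1).
 */
unsigned long badness(const struct task *p)
{
    unsigned long points = p->mem_pages;
    unsigned long s = int_sqrt(p->cpu_secs);

    if (s > 1)
        points /= s;
    return points;
}

/* Pick exactly one victim (goal 4); NULL if there are no tasks. */
const struct task *select_bad_process(const struct task *tasks, size_t n)
{
    const struct task *worst = NULL;
    unsigned long worst_points = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long points = badness(&tasks[i]);

        if (!worst || points > worst_points) {
            worst = &tasks[i];
            worst_points = points;
        }
    }
    return worst;
}
```

With a task that has used six weeks of CPU time and 100000 pages next to a
task with 100 seconds of CPU time and 40000 pages, the second one is chosen
even though it is smaller.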

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread bert hubert

On Mon, Oct 09, 2000 at 02:38:10PM -0700, Linus Torvalds wrote:

> So the process that gave X the bitmap dies. What now? Are we going to
> depend on X un-counting the resources?
> 
> I'd prefer just X having a higher "mm nice level" or something.

I wonder how many megabytes we can fill with all messages about an OOM
killer. I remember threads about this from '94 onwards. Perhaps we can
finally have a sane one now :-)

Regards,

bert hubert

-- 
PowerDNS Versatile DNS Services  
Trilab   The Technology People   
'SYN! .. SYN|ACK! .. ACK!' - the mating call of the internet



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Byron Stanoszek

On Mon, 9 Oct 2000 [EMAIL PROTECTED] wrote:

> Anyway, there is/was an API in PTX to say (either from in-kernel or through
> some user machinations) "I Am a System Process".  Turns on a bit in the
> proc struct (task struct) that made it exempt from death from a variety
> of sources, e.g. OOM, generic user signals, portions of system shutdown,
> etc.

The current OOM killer does this, except for init. Checking to see if the
process has a page table is equivalent to checking for the kernel threads that
are integral to the system (PIDs 2-5). These will never be killed by the OOM.
Init, however, still can be killed, and there should be an additional statement
that doesn't kill if PID == 1.

I think we need to sit down and write a better OOM proposal, something that
doesn't use CPU time and the NICE flag. Let's concentrate our efforts on what
constitutes a good selection method instead of bickering with each other.

How about we start by having everyone in this discussion give their opinion on
what the OOM selection process should do, listing them in order of both
importance and severity, and giving a rational reason for each choice. Maybe
then we can get somewhere.

 -Byron

-- 
Byron Stanoszek Ph: (330) 644-3059
Systems Programmer  Fax: (330) 644-8110
Commercial Timesharing Inc. Email: [EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Gerrit . Huizenga


At Sequent, we found that there is a small set of processes which are
"critical" to the system's operation in that they should not be killed
on swap shortage, memory shortage, etc.  This included things like init,
potentially inetd, the swapper, page daemon, cluster heartbeat daemon,
and generally any core system service which had a user process component.
If there wasn't enough memory for those processes, or if those processes
weren't already responsible in their use of memory/resources, you were
already toast.

Anyway, there is/was an API in PTX to say (either from in-kernel or through
some user machinations) "I Am a System Process".  Turns on a bit in the
proc struct (task struct) that made it exempt from death from a variety
of sources, e.g. OOM, generic user signals, portions of system shutdown,
etc.

Then, the code looking for things to kill simply skips those that are
intelligently marked, taking most of the decision making/policy making
out of the scheduler/memory manager.
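
For illustration, the skip-if-marked selection could look like the sketch
below. It is not the PTX interface: TASK_SYSTEM, struct proc and both
functions are hypothetical names, and "largest unmarked process" is just a
placeholder policy for the example.

```c
#include <stddef.h>

#define TASK_SYSTEM 0x1  /* hypothetical "I Am a System Process" bit */

/* Hypothetical, simplified process table entry. */
struct proc {
    int pid;
    unsigned int flags;
    unsigned long mem_pages;
};

/* Mark a process as exempt from being chosen. */
void mark_system_process(struct proc *p)
{
    p->flags |= TASK_SYSTEM;
}

/*
 * Return the largest process that is not marked, or NULL if every
 * process is exempt (or the table is empty).
 */
struct proc *pick_victim(struct proc *procs, size_t n)
{
    struct proc *victim = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (procs[i].flags & TASK_SYSTEM)
            continue;  /* marked processes are never candidates */
        if (!victim || procs[i].mem_pages > victim->mem_pages)
            victim = &procs[i];
    }
    return victim;
}
```

The caller still has to decide what to do when pick_victim() returns NULL,
i.e. when only exempt processes are left.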

gerrit

> On Mon, 9 Oct 2000, Linus Torvalds wrote:
> > On Mon, 9 Oct 2000, Andi Kleen wrote:
> > > 
> > > netscape usually has child processes: the dns helper. 
> > 
> > Yeah.
> > 
> > One thing we _can_ (and probably should do) is to do a per-user
> > memory pressure thing - we have easy access to the "struct
> > user_struct" (every process has a direct pointer to it), and it
> > should not be too bad to maintain a per-user "VM pressure"
> > counter.
> > 
> > Then, instead of trying to use heuristics like "does this
> > process have children" etc, you'd have things like "is this user
> > a nasty user", which is a much more valid thing to do and can be
> > used to find people who fork tons of processes that are
> > mid-sized but use a lot of memory due to just being many..
> 
> Sure we could do all of this, but does OOM really happen that
> often that we want to make the algorithm this complex ?
> 
> The current algorithm seems to work quite well and is already
> at the limit of how complex I'd like to see it. Having a less
> complex OOM killer turned out to not work very well, but having
> a more complex one is - IMHO - probably overkill ...
> 
> regards,
> 
> Rik
> --
> "What you're running that piece of shit Gnome?!?!"
>-- Miguel de Icaza, UKUUG 2000
> 
> http://www.conectiva.com/ http://www.surriel.com/
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [EMAIL PROTECTED]  For more info on Linux MM,
> see: http://www.linux.eu.org/Linux-MM/
> 



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Jim Gettys


> From: Linus Torvalds <[EMAIL PROTECTED]>
> Date: Mon, 9 Oct 2000 14:50:51 -0700 (PDT)
> To: Jim Gettys <[EMAIL PROTECTED]>
> Cc: Alan Cox <[EMAIL PROTECTED]>, Andi Kleen <[EMAIL PROTECTED]>,
> Ingo Molnar <[EMAIL PROTECTED]>, Andrea Arcangeli <[EMAIL PROTECTED]>,
> Rik van Riel <[EMAIL PROTECTED]>,
> Byron Stanoszek <[EMAIL PROTECTED]>,
>         MM mailing list <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
> Subject: Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler
> -
> On Mon, 9 Oct 2000, Jim Gettys wrote:
> >
> >
> > On Date: Mon, 9 Oct 2000 14:38:10 -0700 (PDT), Linus Torvalds
> <[EMAIL PROTECTED]>
> > said:
> >
> > >
> > > The problem is that there is no way to keep track of them afterwards.
> > >
> > > So the process that gave X the bitmap dies. What now? Are we going to
> > > depend on X un-counting the resources?
> > >
> >
> > X has to uncount the resources already, to free the memory in the X server
> > allocated on behalf of that client.  X has to get this right, to be a long
> > lived server (properly debugged X servers last many months without problems:
> > unfortunately, a fair number of DDX's are buggy).
> 
> No, but my point is that it doesn't really work.
> 
> One of the biggest bitmaps is the background bitmap. So you have a client
> that uploads it to X and then goes away. There's nobody to un-count to by
> the time X decides to switch to another background.

Actually, the big offenders are things other than the background bitmap:
things like E do absolutely insane things, you would not believe (or maybe
you would).  Even in the worst case, the background pixmap is typically
no larger than 4 megabytes (for those people who are crazy enough to put
images up as their root window on 32 bit deep displays, at 1kX1k resolution).
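
For illustration, the 4 megabyte figure is just width times height times
bytes per pixel: 1024 x 1024 x 4 = 4194304 bytes. A minimal sketch, with
pixmap_bytes() as a hypothetical helper that ignores any padding a real
server might add:

```c
#include <stddef.h>

/*
 * Bytes needed for an uncompressed pixmap. Depths that are not a
 * whole number of bytes are rounded up to the next byte per pixel.
 */
size_t pixmap_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    size_t bytes_per_pixel = (bits_per_pixel + 7) / 8;

    return width * height * bytes_per_pixel;
}
```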

> 
> Does that memory just disappear as far as the resource handling is
> concerned when the client that originated it dies?

No, X recovers the memory when a connection dies, unless the client has
gone out of its way to arrange to preserve things across connection
termination.  Few, if any, clients do this: it is possible mostly for
debugging purposes, so that (fortunately or unfortunately, depending on
your opinion) what happened does not just vanish before you can look at
it.

So the X server does extensive bookkeeping of its memory usage, and retrieves
all memory used by clients when they terminate (with the above rare
exception).
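
The bookkeeping described above can be pictured with a small sketch.  It is
not taken from the X server sources: struct client, client_alloc() and
client_disconnect() are hypothetical names, and the retain_on_close flag is
an assumption standing in for the rarely used option to preserve resources
across connection termination.

```c
#include <stdlib.h>
#include <stddef.h>

/* One server-side allocation made on behalf of a client (hypothetical). */
struct resource {
    struct resource *next;
    size_t size;
    void *data;
};

/* Per-connection state: every allocation is linked to its owner. */
struct client {
    struct resource *resources;
    size_t bytes_in_use;
    int retain_on_close;   /* assumption: models the rare "preserve" mode */
};

/* Allocate memory for a client and record it; returns NULL on failure. */
void *client_alloc(struct client *c, size_t size)
{
    struct resource *r = malloc(sizeof *r);

    if (r == NULL)
        return NULL;
    r->data = malloc(size);
    if (r->data == NULL) {
        free(r);
        return NULL;
    }
    r->size = size;
    r->next = c->resources;
    c->resources = r;
    c->bytes_in_use += size;
    return r->data;
}

/* On disconnect, release everything the client owned unless it asked to
 * retain it.  Returns the number of bytes recovered. */
size_t client_disconnect(struct client *c)
{
    size_t freed = 0;

    if (c->retain_on_close)
        return 0;
    while (c->resources != NULL) {
        struct resource *r = c->resources;

        c->resources = r->next;
        freed += r->size;
        free(r->data);
        free(r);
    }
    c->bytes_in_use = 0;
    return freed;
}
```

Because the per-client total is kept next to the list, the server always
knows how much it holds for each connection, which is the figure a kernel
accounting scheme would need to be told about.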

> 
> What happens with TCP connections? They might be local. Or they might not.
> In either case X doesn't know whom to blame.

At least on BSD kernels, it was reasonably straightforward to determine
if a TCP connection was local: in that case, the code actually did an upcall
and delivered data directly to the appropriate socket.  Dunno about the
insides of Linux.

I suspect it should not be hard to find the right process for local
connections.  Distant connections are, indeed, a challenge.

> 
> Basically, the only thing _I_ think X can do is to really say "oh, please
> don't count my memory, because everything I do I do for my clients, not
> for myself".
> 
> THAT is my argument. Basically there is nothing we can reliably account.

Your argument has a lot of validity, though the X server does a better job
of accounting than you might think.

BUT, I'm actually more interested in dealing with scheduling preferences, to
get really first rate interactive feel.

> 
> So we might as well fall back on just saying "X is more important than
> some random client", and have a mm niceness level. Which right now is
> obviously approximated by the IO capabilities tests etc.
> 

As I say above, the principle here may be more useful for controlling
scheduling than for the memory example, so that we can get great
interactive feel.  THAT is what is really worth discussing.
- Jim


--
Jim Gettys
Technology and Corporate Development
Compaq Computer Corporation
[EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Aaron Sethman wrote:

> I think the run time should probably be accounted into to this
> as well. Basically start knocking off recent processes first,
> which are likely to be childless, and start working your way up
> in age.

I'm almost getting USENET flashbacks ...  ;)

Please look at the code before suggesting something that
is already there (and has been in the code for some 2 years).

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Alan Cox

> > across AF_UNIX sockets so the mechanism is notionally there to provide the 
> > credentials to X, just not to use them
> 
> The problem is that there is no way to keep track of them afterwards.

If you use mmap for your allocator then beancounter will get it right. Every
resource knows which beancounter it was charged to. It adds an overhead the
average desktop user won't like but which is pretty much essential to do real
mainframe world operation. So it would become

seteuid(Client->passed_euid);
mmap(buffer in pages)
seteuid(getuid());

With lightweight counting semantics it's hard to make any tracking system work
well in the corner cases like resources that survive process death.
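
A runnable sketch of the three-step sequence above, under stated
assumptions: alloc_for_client() is a hypothetical helper, the mapping is
anonymous, and the previous effective uid is saved and restored instead of
calling seteuid(getuid()).  A stock kernel does not charge anonymous
mappings to the effective uid; only a beancounter-style patch would, so
this shows the call order and nothing more.

```c
#define _DEFAULT_SOURCE      /* for MAP_ANONYMOUS on glibc */
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stddef.h>

/* Map `pages` anonymous pages while temporarily running with the client's
 * effective uid, then switch back to the effective uid we started with.
 * Returns NULL on failure. */
void *alloc_for_client(uid_t client_euid, size_t pages)
{
    uid_t saved_euid = geteuid();
    long page_size = sysconf(_SC_PAGESIZE);
    size_t len;
    void *buf;

    if (page_size <= 0 || pages == 0)
        return NULL;
    len = pages * (size_t)page_size;

    if (seteuid(client_euid) != 0)
        return NULL;

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* Always switch back, even if the mapping failed. */
    if (seteuid(saved_euid) != 0) {
        if (buf != MAP_FAILED)
            munmap(buf, len);
        return NULL;
    }
    return buf == MAP_FAILED ? NULL : buf;
}
```

Switching to another user's euid only succeeds for a privileged server;
an unprivileged caller can pass its own geteuid() to exercise the path.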

Alan




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Aaron Sethman

On Mon, 9 Oct 2000, James Sutherland wrote:

> On Mon, 9 Oct 2000, Ingo Molnar wrote:
> 
> > On Mon, 9 Oct 2000, Rik van Riel wrote:
> > 
> > > > so dns helper is killed first, then netscape. (my idea might not
> > > > make sense though.)
> > > 
> > > It makes some sense, but I don't think OOM is something that
> > > occurs often enough to care about it /that/ much...
> > 
> > i'm trying to handle Andrea's case, the init=/bin/bash manual-bootup case,
> > with 4MB RAM and no swap, where the admin tries to exec a 2MB process. I
> > think it's a legitimate concern - i cannot know in advance whether a
> > freshly started process would trigger an OOM or not.
> 
> Shouldn't the runtime factor handle this, making sure the new process is
> killed? (Maybe not if you're almost OOM right from the word go, and run
> this process straight off... Hrm.)

I think the run time should probably be factored into this as well.
Basically start knocking off recent processes first, which are likely to
be childless, and start working your way up in age. The reasoning here is
that a recent process is less likely to be an important, long-running
service.  Of course you could probably account for whether the process is
childless or not as well.

Just my $0.02 on it..


Aaron




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Linus Torvalds



On Mon, 9 Oct 2000, Jim Gettys wrote:
> 
> 
> On Date: Mon, 9 Oct 2000 14:38:10 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]>
> said:
> 
> >
> > The problem is that there is no way to keep track of them afterwards.
> >
> > So the process that gave X the bitmap dies. What now? Are we going to
> > depend on X un-counting the resources?
> >
> 
> X has to uncount the resources already, to free the memory in the X server
> allocated on behalf of that client.  X has to get this right, to be a long
> lived server (properly debugged X servers last many months without problems:
> unfortunately, a fair number of DDX's are buggy).

No, but my point is that it doesn't really work.

One of the biggest bitmaps is the background bitmap. So you have a client
that uploads it to X and then goes away. There's nobody to un-count to by
the time X decides to switch to another background.

Does that memory just disappear as far as the resource handling is
concerned when the client that originated it dies?

What happens with TCP connections? They might be local. Or they might not.
In either case X doesn't know whom to blame.

Basically, the only thing _I_ think X can do is to really say "oh, please
don't count my memory, because everything I do I do for my clients, not
for myself". 

THAT is my argument. Basically there is nothing we can reliably account.

So we might as well fall back on just saying "X is more important than
some random client", and have a mm niceness level. Which right now is
obviously approximated by the IO capabilities tests etc.

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Jim Gettys



On Date: Mon, 9 Oct 2000 14:38:10 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]>
said:

> 
> The problem is that there is no way to keep track of them afterwards.
> 
> So the process that gave X the bitmap dies. What now? Are we going to
> depend on X un-counting the resources?
> 

X has to uncount the resources already, to free the memory in the X server
allocated on behalf of that client.  X has to get this right, to be a long
lived server (properly debugged X servers last many months without problems:
unfortunately, a fair number of DDX's are buggy).

- Jim

--
Jim Gettys
Technology and Corporate Development
Compaq Computer Corporation
[EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Linus Torvalds



On Mon, 9 Oct 2000, Rik van Riel wrote:
>
> > I'd prefer just X having a higher "mm nice level" or something.
> 
> Which it has, because:
> 
> 1) CAP_SYS_RAWIO
> 2) p->euid == 0

Oh, I agree, but we might want to generalize this a bit so that root could
say "this process is important" and then drop root privileges and still
get "credited" for the fact that it's important.

It's not a big deal. It works for X right now.

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Jim Gettys


> > Sounds like one needs in addition some mechanism for servers to "charge" clients for
> > consumption. X certainly knows on behalf of which connection resources
> > are created; the OS could then transfer this back to the appropriate client
> > (at least when on machine).
> 
> Definitely - and this is present in some non Unix OS's. We do pass credentials
> across AF_UNIX sockets so the mechanism is notionally there to provide the
> credentials to X, just not to use them

Stephen Tweedie, Dave Rosenthal, Keith Packard and I had an extensive
discussion at Usenix on similar ideas around process quantum scheduling
(the X server would like to be able to forward quantum to clients).
This is closely related, and needed to finally fully control interactive
feel in the face of "greedy" clients.

My memory is that it sounded like things could become very interesting
with such a facility, and might be ripe for 2.5.

Keith, Stephen, Dave, do you remember the details of our discussion?
- Jim

--
Jim Gettys
Technology and Corporate Development
Compaq Computer Corporation
[EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Linus Torvalds wrote:
> On Mon, 9 Oct 2000, Alan Cox wrote:
> > > consumption. X certainly knows on behalf of which connection resources
> > > are created; the OS could then transfer this back to the appropriate client
> > > (at least when on machine).
> > 
> > Definitely - and this is present in some non Unix OS's. We do pass credentials
> > across AF_UNIX sockets so the mechanism is notionally there to provide the 
> > credentials to X, just not to use them
> 
> The problem is that there is no way to keep track of them afterwards.
> 
> So the process that gave X the bitmap dies. What now? Are we going to
> depend on X un-counting the resources?
> 
> I'd prefer just X having a higher "mm nice level" or something.

Which it has, because:

1) CAP_SYS_RAWIO
2) p->euid == 0
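
To make the effect of those two checks concrete, here is a user-space
sketch of a badness score with a privilege discount.  It is modelled on
the idea discussed in this thread, not copied from the kernel: struct
proc_info, its fields, the units and the divide-by-four factors are all
assumptions chosen for illustration.

```c
/* Hypothetical snapshot of the per-process facts such a heuristic uses. */
struct proc_info {
    unsigned long vm_pages;   /* total virtual memory, in pages */
    unsigned long cpu_secs;   /* CPU time consumed so far */
    unsigned long run_mins;   /* wall-clock time since start */
    int nice;                 /* > 0 means the process was niced */
    int is_root;              /* euid == 0 */
    int has_rawio;            /* holds the raw I/O capability */
};

/* Integer square root by linear search; fine for the small inputs here. */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;

    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Higher score means a more likely victim. */
unsigned long badness(const struct proc_info *p)
{
    unsigned long points = p->vm_pages;
    unsigned long d;

    d = isqrt(p->cpu_secs);          /* much CPU time used: less bad */
    if (d > 0)
        points /= d;
    d = isqrt(isqrt(p->run_mins));   /* long-running: less bad */
    if (d > 0)
        points /= d;
    if (p->nice > 0)                 /* niced: presumably less important */
        points *= 2;
    if (p->is_root)                  /* the "mm nice level" for root ... */
        points /= 4;
    if (p->has_rawio)                /* ... and for direct hardware access */
        points /= 4;
    return points;
}
```

With these made-up factors, a 1600-page process that is root and holds raw
I/O access scores 100 where an otherwise identical unprivileged process
scores 1600, which is how X ends up well down the list.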

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Linus Torvalds



On Mon, 9 Oct 2000, Alan Cox wrote:
> > consumption. X certainly knows on behalf of which connection resources
> > are created; the OS could then transfer this back to the appropriate client
> > (at least when on machine).
> 
> Definitely - and this is present in some non Unix OS's. We do pass credentials
> across AF_UNIX sockets so the mechanism is notionally there to provide the 
> credentials to X, just not to use them

The problem is that there is no way to keep track of them afterwards.

So the process that gave X the bitmap dies. What now? Are we going to
depend on X un-counting the resources?

I'd prefer just X having a higher "mm nice level" or something.

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Ingo Molnar wrote:
> On Mon, 9 Oct 2000, Rik van Riel wrote:
> 
> > Would this complexity /really/ be worth it for the twice-yearly OOM
> > situation?
> 
> the only reason i suggested this was the init=/bin/bash, 4MB
> RAM, no swap emergency-bootup case. We must not kill init in
> that case - if the current code doesnt then great and none of
> this is needed.

I guess this requires some testing. If anybody can reproduce
the bad effects without going /too/ much out of the way of a
realistic scenario, the code needs to be fixed.

If it turns out to be a non-issue in all scenarios, there's
no need to make the code any more complex.

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andi Kleen

On Mon, Oct 09, 2000 at 10:28:38PM +0100, Alan Cox wrote:
> > Sounds like one needs in addition some mechanism for servers to "charge" clients for
> > consumption. X certainly knows on behalf of which connection resources
> > are created; the OS could then transfer this back to the appropriate client
> > (at least when on machine).
> 
> Definitely - and this is present in some non Unix OS's. We do pass credentials
> across AF_UNIX sockets so the mechanism is notionally there to provide the 
> credentials to X, just not to use them

X can get the pid using SO_PEERCRED for unix connections. 

If the OOM killer maintained some kind of badness value in the task_struct,
it would be possible to add a charge() system call that manipulates it.

int charge(pid_t pid, int memorytobecharged) 


-Andi



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Paul Jakma

On Mon, 9 Oct 2000, David Ford wrote:

> Not if "init" is a particular program running on a router floppy for
> example.  The system may be designed to be a router and the userland
> monitor/control program is the only thing that runs and consumes 90% of the
> memory.  If a forked or spawned process starts up with high CPU that just
> tips it over the OOM edge, we don't really want to kill init even if it's
> taking "all" the memory and or "all" the cpu.

this is such a special case it is not worth considering - rather
leave it up to the designer of the router floppy to get his stuff
right.

the one thing that is clear from the many OOM flamewars is that no
OOM reaper algorithm will satisfy 100% of conditions 100% of the
time. So all Rik can do is optimise for the common case.

(roll on beancounting and proper resource limiting - the true but
heavyweight solution)

regards,
-- 
Paul Jakma  [EMAIL PROTECTED]
PGP5 key: http://www.clubi.ie/jakma/publickey.txt
---
Fortune:
Individualists unite!





Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Rik van Riel wrote:

> Would this complexity /really/ be worth it for the twice-yearly OOM
> situation?

the only reason i suggested this was the init=/bin/bash, 4MB RAM, no swap
emergency-bootup case. We must not kill init in that case - if the current
code doesn't then great and none of this is needed.

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Alan Cox

> Sounds like one needs in addition some mechanism for servers to "charge" clients for
> consumption. X certainly knows on behalf of which connection resources
> are created; the OS could then transfer this back to the appropriate client
> (at least when on machine).

Definitely - and this is present in some non Unix OS's. We do pass credentials
across AF_UNIX sockets so the mechanism is notionally there to provide the 
credentials to X, just not to use them



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Ingo Molnar wrote:
> On Mon, 9 Oct 2000, Alan Cox wrote:
> 
> > Lets kill a 6 week long typical background compute job because
> > netscape exploded (and yes netscape has a child process)
> 
> in the paragraph you didnt quote i pointed this out and
> suggested adding all parent's badness value to children as well
> - so we'd end up killing netscape.

Would this complexity /really/ be worth it for the twice-yearly
OOM situation?

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Jim Gettys


> Sender: [EMAIL PROTECTED]
> From: "Andi Kleen" <[EMAIL PROTECTED]>
> Date: Mon, 9 Oct 2000 22:58:22 +0200
> To: Linus Torvalds <[EMAIL PROTECTED]>
> Cc: Andi Kleen <[EMAIL PROTECTED]>, Ingo Molnar <[EMAIL PROTECTED]>,
> Andrea Arcangeli <[EMAIL PROTECTED]>,
> Rik van Riel <[EMAIL PROTECTED]>,
> Byron Stanoszek <[EMAIL PROTECTED]>,
>     MM mailing list <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
> Subject: Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler
> -
> On Mon, Oct 09, 2000 at 01:52:21PM -0700, Linus Torvalds wrote:
> > One thing we _can_ (and probably should do) is to do a per-user memory
> > pressure thing - we have easy access to the "struct user_struct" (every
> > process has a direct pointer to it), and it should not be too bad to
> > maintain a per-user "VM pressure" counter.
> >
> > Then, instead of trying to use heuristics like "does this process have
> > children" etc, you'd have things like "is this user a nasty user", which
> > is a much more valid thing to do and can be used to find people who fork
> > tons of processes that are mid-sized but use a lot of memory due to just
> > being many..
> 
> Would not help much when "they" eat your memory by loading big bitmaps
> into the X server which runs as root (it seems there are many programs
> which are very good at this particular DOS ;)
> 

This is generic to any server program, not unique to X.

Sounds like one needs in addition some mechanism for servers to "charge" clients for
consumption. X certainly knows on behalf of which connection resources
are created; the OS could then transfer this back to the appropriate client
(at least when on machine).

- Jim

--
Jim Gettys
Technology and Corporate Development
Compaq Computer Corporation
[EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Alan Cox wrote:

> Lets kill a 6 week long typical background compute job because
> netscape exploded (and yes netscape has a child process)

in the paragraph you didn't quote i pointed this out and suggested adding
each parent's badness value to its children as well - so we'd end up
killing netscape.
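
The inheritance rule can be sketched as follows.  This is an illustration
of the proposal only, not code from any kernel: struct node, the array
layout and the scores used in the note below are made up.

```c
/* A process in a flat table; `parent` indexes the same table. */
struct node {
    int parent;              /* index of the parent, or -1 for the root */
    unsigned long badness;   /* the process's own score */
};

/* Own score plus the scores of all ancestors. */
unsigned long inherited_badness(const struct node *t, int i)
{
    unsigned long total = 0;

    while (i >= 0) {
        total += t[i].badness;
        i = t[i].parent;
    }
    return total;
}

/* Index of the process with the highest inherited score, or -1 if the
 * table is empty.  Ties go to the earlier entry. */
int pick_victim(const struct node *t, int n)
{
    int best = -1;
    unsigned long best_score = 0;
    int i;

    for (i = 0; i < n; i++) {
        unsigned long s = inherited_badness(t, i);

        if (best < 0 || s > best_score) {
            best = i;
            best_score = s;
        }
    }
    return best;
}
```

With init at 1, a compute job at 500, netscape at 900 and its dns helper at
10, the helper inherits 911 and is picked first; netscape (901) is next,
and the compute job (501) is left alone.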

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Alan Cox

> i think the OOM algorithm should not kill processes that have
> child-processes, it should first kill child-less 'leaves'. Killing a
> process that has child processes likely results in unexpected behavior of
> those child-processes. (and equals to effective killing of those
> child-processes as well.)

Let's kill a 6 week long typical background compute job because netscape exploded
(and yes netscape has a child process)

Rik's current OOM killer works very well but it's a heuristic, so like all
heuristics you can always find a problem case

Alan




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Alan Cox

> Then spam the console loudly with printk, but don't destroy the whole machine.
> Init should only get killed if it REALLY is taking a lot of memory.  On a 4 or 8meg

If init dies the kernel hangs solid anyway




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Linus Torvalds wrote:
> On Mon, 9 Oct 2000, Andi Kleen wrote:
> > 
> > netscape usually has child processes: the dns helper. 
> 
> Yeah.
> 
> One thing we _can_ (and probably should do) is to do a per-user
> memory pressure thing - we have easy access to the "struct
> user_struct" (every process has a direct pointer to it), and it
> should not be too bad to maintain a per-user "VM pressure"
> counter.
> 
> Then, instead of trying to use heuristics like "does this
> process have children" etc, you'd have things like "is this user
> a nasty user", which is a much more valid thing to do and can be
> used to find people who fork tons of processes that are
> mid-sized but use a lot of memory due to just being many..

Sure we could do all of this, but does OOM really happen so
often that we want to make the algorithm this complex?

The current algorithm seems to work quite well and is already
at the limit of how complex I'd like to see it. Having a less
complex OOM killer turned out to not work very well, but having
a more complex one is - IMHO - probably overkill ...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andi Kleen

On Mon, Oct 09, 2000 at 01:52:21PM -0700, Linus Torvalds wrote:
> One thing we _can_ (and probably should do) is to do a per-user memory
> pressure thing - we have easy access to the "struct user_struct" (every
> process has a direct pointer to it), and it should not be too bad to
> maintain a per-user "VM pressure" counter.
> 
> Then, instead of trying to use heuristics like "does this process have
> children" etc, you'd have things like "is this user a nasty user", which
> is a much more valid thing to do and can be used to find people who fork
> tons of processes that are mid-sized but use a lot of memory due to just
> being many..

Would not help much when "they" eat your memory by loading big bitmaps
into the X server which runs as root (it seems there are many programs
which are very good at this particular DoS ;)

Also I think most OOM situations are accidents anyway, not malicious users.
When you're the only user of the machine, sophisticated per-user accounting
won't be very useful.

-Andi



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Linus Torvalds



On Mon, 9 Oct 2000, Andi Kleen wrote:
> 
> netscape usually has child processes: the dns helper. 

Yeah.

One thing we _can_ (and probably should do) is to do a per-user memory
pressure thing - we have easy access to the "struct user_struct" (every
process has a direct pointer to it), and it should not be too bad to
maintain a per-user "VM pressure" counter.

Then, instead of trying to use heuristics like "does this process have
children" etc, you'd have things like "is this user a nasty user", which
is a much more valid thing to do and can be used to find people who fork
tons of processes that are mid-sized but use a lot of memory due to just
being many..
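
A user-space sketch of the per-user view, under stated assumptions: a real
implementation would presumably keep a running counter in struct
user_struct, whereas this example recomputes the totals from a flat table.
struct proc_mem, user_pressure() and heaviest_user() are hypothetical
names.

```c
#include <sys/types.h>
#include <stddef.h>

/* One process: who owns it and how many pages it uses. */
struct proc_mem {
    uid_t uid;
    unsigned long pages;
};

/* Sum the pages of every process owned by `uid`. */
unsigned long user_pressure(const struct proc_mem *p, size_t n, uid_t uid)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (p[i].uid == uid)
            total += p[i].pages;
    return total;
}

/* Return the uid with the largest total.  Quadratic for clarity;
 * the table must contain at least one entry. */
uid_t heaviest_user(const struct proc_mem *p, size_t n)
{
    uid_t worst = p[0].uid;
    unsigned long worst_total = user_pressure(p, n, worst);
    size_t i;

    for (i = 1; i < n; i++) {
        unsigned long t = user_pressure(p, n, p[i].uid);

        if (t > worst_total) {
            worst = p[i].uid;
            worst_total = t;
        }
    }
    return worst;
}
```

Three mid-sized 2000-page processes owned by one user outweigh a single
5000-page process owned by another, which a per-process score would miss.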

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Linus Torvalds wrote:

> I disagree - if we start adding these kinds of heuristics to it, it
> wil just be a way for people to try to confuse the OOM code. Imagine
> some bad guy that does 15 fork()'s and then tries to OOM...

yep.

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Linus Torvalds wrote:
> On Mon, 9 Oct 2000, Ingo Molnar wrote:
> > 
> > i think the OOM algorithm should not kill processes that have
> > child-processes, it should first kill child-less 'leaves'. Killing a
> > process that has child processes likely results in unexpected behavior of
> > those child-processes. (and equals to effective killing of those
> > child-processes as well.)
> 
> I disagree - if we start adding these kinds of heuristics to it,
> it wil just be a way for people to try to confuse the OOM code.
> Imagine some bad guy that does 15 fork()'s and then tries to
> OOM...

Also, the only way to prevent bad things like this is userbeans,
the per-user resource quotas; until we have that there will ALWAYS
be ways to fool the OOM killer. It is just a stop-gap measure to
recover from a very bad situation...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Rik van Riel wrote:

> > How about SIGTERM a bit before SIGKILL then re-evaluate the OOM
> > N usecs later?
>
> And run the risk of having to kill /another/ process as well ?
>
> I really don't know if that would be a wise thing to do
> (but feel free to do some tests to see if your idea would
> work ... I'd love to hear some test results with your idea).

I was thinking (dangerous) about an urgent vs. critical OOM.  Urgent could
trigger a SIGTERM, which would give advance notice to the offending process.
I don't think we have a signal for notifying processes when resources
are critically low; feel free to correct me.

Is there a signal that -might- be used for this?

-d

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President






Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andrea Arcangeli

On Mon, Oct 09, 2000 at 09:38:08PM +0100, James Sutherland wrote:
> Shouldn't the runtime factor handle this, making sure the new process is

The runtime factor in the algorithm will only start to make a difference
after a lot of time has passed (and run_time can also be wrong because of
jiffies wraparound). But even if it made a difference after 1 second,
there would still be a 1 second window in which init can be killed.
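
The wraparound point can be shown with plain unsigned arithmetic.  This
sketch assumes a free-running 32-bit tick counter, as on a 32-bit machine;
ticks_elapsed() is a hypothetical helper, not a kernel function.  At an
assumed 100 ticks per second such a counter wraps after 2^32 / 100
seconds, roughly 497 days.

```c
#include <stdint.h>

/* Ticks between two readings of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so a single wrap between
 * `start` and `now` still gives the right answer. */
uint32_t ticks_elapsed(uint32_t now, uint32_t start)
{
    return (uint32_t)(now - start);
}
```

The result is only trustworthy while the true age is below 2^32 ticks.  A
process that is 2^32 + 100 ticks old is indistinguishable from one that is
100 ticks old, so a very old process can look freshly started.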

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, James Sutherland wrote:
> On Mon, 9 Oct 2000, Ingo Molnar wrote:
> > On Mon, 9 Oct 2000, Rik van Riel wrote:
> > 
> > > > so dns helper is killed first, then netscape. (my idea might not
> > > > make sense though.)
> > > 
> > > It makes some sense, but I don't think OOM is something that
> > > occurs often enough to care about it /that/ much...
> > 
> > i'm trying to handle Andrea's case, the init=/bin/bash manual-bootup case,
> > with 4MB RAM and no swap, where the admin tries to exec a 2MB process. I
> > think it's a legitimate concern - i cannot know in advance whether a
> > freshly started process would trigger an OOM or not.
> 
> Shouldn't the runtime factor handle this, making sure the new
> process is killed? (Maybe not if you're almost OOM right from
> the word go, and run this process straight off... Hrm.)

It should.

Also, the example is a tad unrealistic since init seems to be
around 70 kB in size on my systems ;)

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Linus Torvalds



On Mon, 9 Oct 2000, Ingo Molnar wrote:
> 
> i think the OOM algorithm should not kill processes that have
> child-processes, it should first kill child-less 'leaves'. Killing a
> process that has child processes likely results in unexpected behavior of
> those child-processes. (and equals to effective killing of those
> child-processes as well.)

I disagree - if we start adding these kinds of heuristics to it, it will
just be a way for people to try to confuse the OOM code. Imagine some bad
guy that does 15 fork()'s and then tries to OOM...

Linus




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread James Sutherland

On Mon, 9 Oct 2000, Ingo Molnar wrote:

> On Mon, 9 Oct 2000, Rik van Riel wrote:
> 
> > > so dns helper is killed first, then netscape. (my idea might not
> > > make sense though.)
> > 
> > It makes some sense, but I don't think OOM is something that
> > occurs often enough to care about it /that/ much...
> 
> i'm trying to handle Andrea's case, the init=/bin/bash manual-bootup case,
> with 4MB RAM and no swap, where the admin tries to exec a 2MB process. I
> think it's a legitimate concern - i cannot know in advance whether a
> freshly started process would trigger an OOM or not.

Shouldn't the runtime factor handle this, making sure the new process is
killed? (Maybe not if you're almost OOM right from the word go, and run
this process straight off... Hrm.)


James.




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, David Ford wrote:
> Ingo Molnar wrote:
> 
> > > a good idea to have SIGKILL delivery speeded up for every SIGKILL ...
> >
> > yep.
> 
> How about SIGTERM a bit before SIGKILL then re-evaluate the OOM
> N usecs later?

And run the risk of having to kill /another/ process as well ?

I really don't know if that would be a wise thing to do
(but feel free to do some tests to see if your idea would
work ... I'd love to hear some test results with your idea).

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Ingo Molnar wrote:

> > a good idea to have SIGKILL delivery speeded up for every SIGKILL ...
>
> yep.

How about SIGTERM a bit before SIGKILL then re-evaluate the OOM N usecs
later?

-d

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President






Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Rik van Riel wrote:

> On Mon, 9 Oct 2000, Andrea Arcangeli wrote:
> > On Mon, Oct 09, 2000 at 04:07:32PM -0300, Rik van Riel wrote:
> > > No. It's only needed if your OOM algorithm is so crappy that
> > > it might end up killing init by mistake.
> >
> > The algorithm you posted on the list in this thread will kill
> > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > and you execute a task that grows over 1M.
>
> This sounds suspiciously like the description of a DEAD system ;)
>
> (in which case you simply don't care if init is being killed or not)

Not if "init" is a particular program running on a router floppy for
example.  The system may be designed to be a router and the userland
monitor/control program is the only thing that runs and consumes 90% of the
memory.  If a forked or spawned process starts up with high CPU that just
tips it over the OOM edge, we don't really want to kill init even if it's
taking "all" the memory and/or "all" the cpu.


--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andrea Arcangeli

On Mon, Oct 09, 2000 at 05:06:48PM -0300, Rik van Riel wrote:
> On Mon, 9 Oct 2000, Andrea Arcangeli wrote:
> > On Mon, Oct 09, 2000 at 04:07:32PM -0300, Rik van Riel wrote:
> > > No. It's only needed if your OOM algorithm is so crappy that
> > > it might end up killing init by mistake.
> > 
> > The algorithm you posted on the list in this thread will kill
> > init if on 4Mbyte machine without swap init is large 3 Mbytes
> > and you execute a task that grows over 1M.
> 
> This sounds suspiciously like the description of a DEAD system ;)

The system will be DEAD only when your current algorithm will kill init.

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Ingo Molnar wrote:
> On Mon, 9 Oct 2000, Rik van Riel wrote:
> 
> > > so dns helper is killed first, then netscape. (my idea might not
> > > make sense though.)
> > 
> > It makes some sense, but I don't think OOM is something that
> > occurs often enough to care about it /that/ much...
> 
> i'm trying to handle Andrea's case, the init=/bin/bash
> manual-bootup case, with 4MB RAM and no swap, where the admin
> tries to exec a 2MB process. I think it's a legitimate concern -
> i cannot know in advance whether a freshly started process would
> trigger an OOM or not.

In that case the time running and the cpu time used
factors should give the new process a heavy penalty
compared to init.

(but I'd be curious if somebody actually manages to
trick the OOM killer into killing init ... please
test a bit more to see if this really happens ;))
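
To make the run-time and CPU-time factors concrete, here is a minimal
userspace sketch in C. It is not the kernel's code: it merely assumes a
score that starts from memory size and is divided by the square root of
the CPU time used and the fourth root of the run time. The struct, its
field names, and the scaling are all assumptions made for illustration.

```c
/* Hypothetical per-task summary; the field names are illustrative only. */
struct task_info {
	unsigned long vm_pages;    /* memory size, in pages */
	unsigned long cpu_seconds; /* CPU time consumed so far */
	unsigned long run_seconds; /* wall-clock time since the task started */
};

/* Integer square root by linear search; slow but fine for a sketch. */
static unsigned long isqrt(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/* Higher score means a better candidate for killing. */
static unsigned long badness(const struct task_info *t)
{
	unsigned long points = t->vm_pages;
	unsigned long cpu_div = isqrt(t->cpu_seconds);        /* sqrt(cpu time) */
	unsigned long run_div = isqrt(isqrt(t->run_seconds)); /* 4th root of run time */

	/* Only divide when the divisor is meaningful (avoids dividing by 0). */
	if (cpu_div > 1)
		points /= cpu_div;
	if (run_div > 1)
		points /= run_div;
	return points;
}
```

Under these assumptions a 768-page init that has been up for a while
scores far lower than a freshly started 512-page process, because the
new process has no accumulated CPU time or run time to divide by.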

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Andrea Arcangeli wrote:
> On Mon, Oct 09, 2000 at 10:06:02PM +0200, Ingo Molnar wrote:
> > i think the OOM algorithm should not kill processes that have
> > process that has child processes likely results in unexpected behavior of
> 
> You just know what I think about those heuristics. I think all
> we need is a per-task pagefault/allocation rate avoiding any
> other complication that tries to do the right thing but that it
> will end doing the wrong thing eventually, but obviously nobody
> agreed with me and before I implement that myself it will still
> take some time.

Furthermore, keeping track of these allocations will mean that you
/ALWAYS/ rack up the overhead of keeping track of this, even though
most machines probably won't run out of memory ever, or no more
than twice a year or so ;)

> Even the total_vm information will be wrong for example if the
> task was a netscape iconized and completely swapped out that
> wasn't running since two days. Killing it is going to only delay
> the killing of the real offender that is generating a flood of
> page faults at high frequency.

However true this may be, I wonder if we really care /that/ much.

OOM is a very rare situation and as long as you don't do something
that's really a bad surprise to the user, everything should be ok.

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Rik van Riel wrote:

> > so dns helper is killed first, then netscape. (my idea might not
> > make sense though.)
> 
> It makes some sense, but I don't think OOM is something that
> occurs often enough to care about it /that/ much...

i'm trying to handle Andrea's case, the init=/bin/bash manual-bootup case,
with 4MB RAM and no swap, where the admin tries to exec a 2MB process. I
think it's a legitimate concern - i cannot know in advance whether a
freshly started process would trigger an OOM or not.

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Andrea Arcangeli wrote:

> On Mon, Oct 09, 2000 at 12:30:20PM -0700, David Ford wrote:
> > Init should only get killed if it REALLY is taking a lot of memory.  On a 4 or 8meg
>
> Init should never get killed. Killing init can be compared to destroying the TCP
> stack. Some apps can keep running fine for some minutes until they call socket()
> and then they will hang. Same with init, some tasks may still run fine for
> some time but the machine will die eventually. We simply must not pass the
> point of no return or we're buggy and after the bug has triggered we have to force
> the user to reboot the machine as the only way to recover.

After 1/2 a second of deep reflection, I concur.  Pretty much all interactive processes
will die immediately.  That just doesn't make for happy penguins.

-d

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President






Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Rik van Riel wrote:

> Note that the OOM killer already has this code built-in, but it may be

oops, i didn't notice (really!). One comment: 5*HZ in your code is way too
much for counter, and it might break the scheduler in the future. (right
now those counter values are unused, RT priorities start at 1000, so it
cannot cause harm, but one never knows.) Please use MAX_COUNTER instead.

The SCHED_YIELD thing is a nice trick, it should be added to my signal.c
change as well, without the schedule().

> a good idea to have SIGKILL delivery speeded up for every SIGKILL ...

yep.

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Ingo Molnar wrote:
> On Mon, 9 Oct 2000, Andi Kleen wrote:
> 
> > netscape usually has child processes: the dns helper.
> 
> so dns helper is killed first, then netscape. (my idea might not
> make sense though.)

It makes some sense, but I don't think OOM is something that
occurs often enough to care about it /that/ much...

My algorithm is already complex enough for my tastes (but seems
to work quite well in the sense that it usually picks the "right"
process in one shot and kills the process the user expects to be
killed).

regards,


Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andrea Arcangeli

On Mon, Oct 09, 2000 at 10:06:02PM +0200, Ingo Molnar wrote:
> i think the OOM algorithm should not kill processes that have
> process that has child processes likely results in unexpected behavior of

You just know what I think about those heuristics. I think all we need is a
per-task pagefault/allocation rate avoiding any other complication that tries
to do the right thing but that it will end doing the wrong thing eventually,
but obviously nobody agreed with me and before I implement that myself it will
still take some time.

Even the total_vm information will be wrong for example if the task was a
netscape iconized and completely swapped out that wasn't running since two days.
Killing it is going to only delay the killing of the real offender that is
generating a flood of page faults at high frequency.
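
A per-task fault rate of the kind proposed here could be tracked roughly
as follows. This is only a userspace sketch in C under assumed names
(fault_stat, account_fault, end_period) and an assumed 3/4 decay factor;
none of it comes from an actual patch.

```c
/* Hypothetical per-task fault accounting; all names are illustrative. */
struct fault_stat {
	unsigned long faults_this_period; /* faults seen in the current period */
	unsigned long rate;               /* decayed average, faults per period */
};

/* Called on every page fault (or allocation) charged to the task. */
static void account_fault(struct fault_stat *s)
{
	s->faults_this_period++;
}

/* Called once per sampling period: new rate = 3/4 old rate + 1/4 current. */
static void end_period(struct fault_stat *s)
{
	s->rate = (3 * s->rate + s->faults_this_period) / 4;
	s->faults_this_period = 0;
}
```

A swapped-out, idle netscape keeps a rate of 0 in this scheme, while a
task faulting heavily in every period quickly builds up a high rate, so
the victim would be chosen by current behaviour rather than by total_vm.
The counter increment on every fault is a cost that is paid even on
machines that never run out of memory.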

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Ingo Molnar wrote:

> what do you think about the attached patch? It increases the effective
> priority of a (kernel-) killed process, and initiates a reschedule, so
> that it gets selected ASAP. (except if there are RT processes around.)
> This should make OOM decisions 'visible' much more quickly.

Note that the OOM killer already has this code built-in,
but it may be a good idea to have SIGKILL delivery speeded
up for every SIGKILL ...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/


--- linux/kernel/signal.c.orig  Mon Oct  9 12:56:45 2000
+++ linux/kernel/signal.c   Mon Oct  9 13:00:20 2000
@@ -569,6 +569,14 @@
 		spin_unlock_irqrestore(&t->sigmask_lock, flags);
 		return -ESRCH;
 	}
+	/*
+	 * Special case, kernel is forcing SIGKILL.
+	 * Decrease signal delivery latency.
+	 */
+	if (sig == SIGKILL && (t->policy == SCHED_OTHER)) {
+		t->counter = MAX_COUNTER;
+		current->need_resched = 1;
+	}
 
 	if (t->sig->action[sig-1].sa.sa_handler == SIG_IGN)
 		t->sig->action[sig-1].sa.sa_handler = SIG_DFL;



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Andi Kleen wrote:

> netscape usually has child processes: the dns helper.

so dns helper is killed first, then netscape. (my idea might not make
sense though.)

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, Andrea Arcangeli wrote:
> On Mon, Oct 09, 2000 at 04:07:32PM -0300, Rik van Riel wrote:
> > No. It's only needed if your OOM algorithm is so crappy that
> > it might end up killing init by mistake.
> 
> The algorithm you posted on the list in this thread will kill
> init if on 4Mbyte machine without swap init is large 3 Mbytes
> and you execute a task that grows over 1M.

This sounds suspiciously like the description of a DEAD system ;)

(in which case you simply don't care if init is being killed or not)

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Rik van Riel

On Mon, 9 Oct 2000, David Ford wrote:

> Then spam the console loudly with printk, but don't destroy the
> whole machine. Init should only get killed if it REALLY is
> taking a lot of memory.  On a 4 or 8meg machine tho, the
> probability of init getting killed is simply too high for
> comfort.  I have never ever seen init start consuming memory
> like this so I'd rather get spammed on the console a LOT than
> have my entire machine instantly go dead.

Please TEST THIS before spreading Wild Rumours(tm)

On 2.2 a /random/ process gets killed when the system gets
tight, so you'll see init killed on (pre-kludge) 2.2 kernels,
but I don't believe you'll see this with 2.4...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andi Kleen

On Mon, Oct 09, 2000 at 10:06:02PM +0200, Ingo Molnar wrote:
> 
> On Mon, 9 Oct 2000, Andrea Arcangeli wrote:
> 
> > > No. It's only needed if your OOM algorithm is so crappy that
> > > it might end up killing init by mistake.
> > 
> > The algorithm you posted on the list in this thread will kill init if
> > on 4Mbyte machine without swap init is large 3 Mbytes and you execute
> > a task that grows over 1M.
> 
> i think the OOM algorithm should not kill processes that have
> child-processes, it should first kill child-less 'leaves'. Killing a
> process that has child processes likely results in unexpected behavior of
> those child-processes. (and equals to effective killing of those
> child-processes as well.)

netscape usually has child processes: the dns helper. 

-Andi



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


Rik,

what do you think about the attached patch? It increases the effective
priority of a (kernel-) killed process, and initiates a reschedule, so
that it gets selected ASAP. (except if there are RT processes around.)
This should make OOM decisions 'visible' much more quickly.

Ingo


--- linux/kernel/signal.c.orig  Mon Oct  9 12:56:45 2000
+++ linux/kernel/signal.c   Mon Oct  9 13:00:20 2000
@@ -569,6 +569,14 @@
 		spin_unlock_irqrestore(&t->sigmask_lock, flags);
 		return -ESRCH;
 	}
+	/*
+	 * Special case, kernel is forcing SIGKILL.
+	 * Decrease signal delivery latency.
+	 */
+	if (sig == SIGKILL && (t->policy == SCHED_OTHER)) {
+		t->counter = MAX_COUNTER;
+		current->need_resched = 1;
+	}
 
 	if (t->sig->action[sig-1].sa.sa_handler == SIG_IGN)
 		t->sig->action[sig-1].sa.sa_handler = SIG_DFL;



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andrea Arcangeli

On Mon, Oct 09, 2000 at 12:30:20PM -0700, David Ford wrote:
> Init should only get killed if it REALLY is taking a lot of memory.  On a 4 or 8meg

Init should never get killed. Killing init can be compared to destroying the TCP
stack. Some apps can keep running fine for some minutes until they call socket()
and then they will hang. Same with init, some tasks may still run fine for
some time but the machine will die eventually. We simply must not pass the
point of no return or we're buggy and after the bug has triggered we have to force
the user to reboot the machine as the only way to recover.

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Ingo Molnar


On Mon, 9 Oct 2000, Andrea Arcangeli wrote:

> > No. It's only needed if your OOM algorithm is so crappy that
> > it might end up killing init by mistake.
> 
> The algorithm you posted on the list in this thread will kill init if
> on 4Mbyte machine without swap init is large 3 Mbytes and you execute
> a task that grows over 1M.

i think the OOM algorithm should not kill processes that have
child-processes, it should first kill child-less 'leaves'. Killing a
process that has child processes likely results in unexpected behavior of
those child-processes. (and equals to effective killing of those
child-processes as well.)

But this mechanism can be abused (a malicious memory hog can create a
child-process just to avoid the OOM-killer) - but there are ways to avoid
this, eg. to add all the 'MM badness' points to children? Ie. a child
which has MM-abuser parent(s) will definitely be killed first.
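
The leaf-first idea with inherited 'MM badness' can be sketched as
follows. This is a userspace illustration in C, not kernel code: the
process table, the parent index, and the function names are hypothetical,
and the table is assumed to form a proper tree (no cycles).

```c
#define NO_PARENT (-1)

/* Hypothetical process table entry; names are illustrative only. */
struct proc {
	int parent;            /* index of the parent entry, or NO_PARENT */
	unsigned long badness; /* this process's own 'MM badness' */
};

/* A leaf is an entry that no other entry names as its parent. */
static int is_leaf(const struct proc *tbl, int n, int i)
{
	int j;

	for (j = 0; j < n; j++)
		if (tbl[j].parent == i)
			return 0;
	return 1;
}

/* Own badness plus the badness of every ancestor. */
static unsigned long inherited_badness(const struct proc *tbl, int i)
{
	unsigned long sum = 0;

	for (; i != NO_PARENT; i = tbl[i].parent)
		sum += tbl[i].badness;
	return sum;
}

/* Return the index of the leaf with the highest inherited badness, or -1. */
static int pick_leaf_victim(const struct proc *tbl, int n)
{
	int i, victim = -1;
	unsigned long b, worst = 0;

	for (i = 0; i < n; i++) {
		if (!is_leaf(tbl, n, i))
			continue;
		b = inherited_badness(tbl, i);
		if (victim < 0 || b > worst) {
			victim = i;
			worst = b;
		}
	}
	return victim;
}
```

With a table of init, a large netscape, its small dns helper, and a
shell, the helper is picked first: it is a leaf and inherits its parent's
high score. A memory hog that forks a child just to stop being a leaf
therefore only moves the kill to that child.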

Ingo




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Andrea Arcangeli

On Mon, Oct 09, 2000 at 04:07:32PM -0300, Rik van Riel wrote:
> No. It's only needed if your OOM algorithm is so crappy that
> it might end up killing init by mistake.

The algorithm you posted on the list in this thread will kill init if on 4Mbyte
machine without swap init is large 3 Mbytes and you execute a task that grows
over 1M.

So I repeat again: for correctness you should either fix the oom algorithm and
demonstrate with math that it can't kill init or fix the bug using a magic
check.

Since it's not going to be possible to prove that an oom algorithm won't kill
init (also considering init isn't always /sbin/init) the magic check is going
to be the only bugfix possible.
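
The "magic check" amounts to excluding pid 1 during victim selection, no
matter what score it gets. A minimal userspace sketch in C follows; the
struct and function names are hypothetical and this is not the 2.2 fix
itself.

```c
/* Hypothetical candidate record; names are illustrative only. */
struct cand {
	int pid;
	unsigned long badness;
};

/* Return the pid of the worst-scoring task other than init, or -1. */
static int select_victim(const struct cand *c, int n)
{
	int i, victim = -1;
	unsigned long worst = 0;

	for (i = 0; i < n; i++) {
		if (c[i].pid == 1)
			continue; /* the magic check: never pick init */
		if (victim < 0 || c[i].badness > worst) {
			victim = c[i].pid;
			worst = c[i].badness;
		}
	}
	return victim;
}
```

In the 4 Mbyte scenario, a 3 Mbyte init would outscore the 1 Mbyte task
on size alone, yet the check still returns the other task. Testing the
pid rather than the program name also covers the case where init is not
/sbin/init.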

Andrea



Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread Marco Colombo

On Mon, 9 Oct 2000, Rik van Riel wrote:

> On Mon, 9 Oct 2000, Marco Colombo wrote:
> > On Fri, 6 Oct 2000, Rik van Riel wrote:
> > 
> > [...]
> > > They are niced because the user thinks them a bit less
> > > important. 
> > 
> > Please don't, this assumption is quite wrong. I use nice just to
> > be 'nice' to other users. I can run my *important* CPU hog
> > simulation nice +10 in order to let other people get more CPU
> > when they need it.
> 
> In that case the time the process has been running and the
> CPU time used will save the process if it's been running for
> a long time.
> 
> Please read the /entire/ algorithm before making rash
> conclusions like this.


What "conclusions"? YOU stated that "They are niced because the user
thinks them a bit less important", and I was only commenting on that.
I've never said your /entire/ algorithm is failing, the point was on
the purpose of the 'nice' level. Users don't use nice to mark less 
important processes. It's completely orthogonal. And if you really
want to correlate nice level and importance, I'd rather say that
niced processes tend to be more important than "normal" processes,
on average.


> If nice is used for important, long-running tasks, the fact
> that they are long-running will save them (and be honest,
> would you really care if a simulation would be killed after
> 5 minutes?  it's only inconvenient if it gets killed after
> a few hours...)

Ok. Now tell me what's the purpose to run your 'ls' at nice +5 at all.
You nice processes that are going to take a while, otherwise nicing
them has hardly a measurable effect, if any.

> > But if you put the logic "niced == not important" somewhere into
> > the kernel, nobody will use nice anymore. I'd rather give a
> > bonus to niced processes.
> 
> This doesn't make ANY sense at all. The objective is to destroy
> the least amount of work, which means giving a bonus to processes
> which have used a lot of CPU time already ... regardless of nice
> value.

'regardless of nice value' is the part I like.

> > all. But my point here is that you do, and you take it as an hint for
> > process importance as perceived by the user that runs it, and I believe
> > it's just wrong guessing).
> 
> If you have a better algorithm, feel free to send patches.

No need. Either reverse the weight you give to nice level or just
ignore it, which probably is easier. I agree that giving a bonus to
niced processes is nearly useless.
As I've written in my previous message, I don't think it's a big
issue. OOM should not happen, full stop. OOM killer is a last resort
measure, so it needs not to be *too* careful.

> 
> regards,
> 
> Rik
> --
> "What you're running that piece of shit Gnome?!?!"
>-- Miguel de Icaza, UKUUG 2000
> 
> http://www.conectiva.com/ http://www.surriel.com/
> 
> 

.TM.
-- 
  /  /   /
 /  /   /   Marco Colombo
___/  ___  /   /  Technical Manager
   /  /   /  ESI s.r.l.
 _/ _/  _/ [EMAIL PROTECTED]




Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Then spam the console loudly with printk, but don't destroy the whole machine.
Init should only get killed if it REALLY is taking a lot of memory.  On a 4 or 8meg
machine tho, the probability of init getting killed is simply too high for
comfort.  I have never ever seen init start consuming memory like this so I'd
rather get spammed on the console a LOT than have my entire machine instantly go
dead.

We get enough reports about innocuous messages on the console, I'm sure these would
get reported to LKML as well...and in short order as is usual.

-d

Ingo Molnar wrote:

> On Mon, 9 Oct 2000, Andrea Arcangeli wrote:
>
> > On Fri, Oct 06, 2000 at 04:19:55PM -0400, Byron Stanoszek wrote:
> > > In the OOM killer, shouldn't there be a check for PID 1 just to enforce that
> >
> > Init can't be killed in 2.2.x latest, the same bugfix should be forward
> > ported to 2.4.x.
>
> I believe we should not special-case init in this case. If the OOM would
> kill init then we *want* to know about it ASAP, because it's either a bug
> in the OOM code or a memory leak in init. Both things are very bad, and
> ignoring the kill would just preserve those bugs artificially.

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President






Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Here's an idea, farfetched as it may be.

Page the entire process out to disk into a user defined area, SIGSTOP it and use
printk or a kthread/userproc to notify the user that something was kicked out of
the sandbox for playing bad.  The user can add more swap if desired, then use a
userland tool w/ the kthread/userproc to pop it back into memory or destroy it
and SIGCONT.

This allows the user to make almost all decisions about what gets killed and
what remains running.  Necessarily if you're out of disk space as well we'll
have to resort to simply killing.  Hmm, sounds like 2.5.

-d

Jamie Lokier wrote:

> Kurt Garloff wrote:
> > I could not agree more. Normally, you'd better kill a foreground task
> > (running nice 0) than selecting one of those background jobs for some
> > reasons:
> > * The foreground job can be restarted by the interactive user
> >   (Most likely, it will be only netscape anyway)
> > * The background job probably is the more useful one which has been running
> >   since a longer time (computations, ...)
>
> Ick.  A background job that's been running for a long time will be saved
> by that, as Rik pointed out.
>
> If I've got a background process running for 30 minutes, and a Netscape
> with 5 windows open that I'm using (for long or not, doesn't matter),
> guess which one I'd rather died?  Not Netscape -- I'm using that and
> I'll never remember how to find those 5 windows again if it just dies.
>
> -- Jamie

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President






Re: [PATCH] VM fix for 2.4.0-test9 & OOM handler

2000-10-09 Thread David Ford

Rik van Riel wrote:

> On Mon, 9 Oct 2000, Marco Colombo wrote:
> > On Fri, 6 Oct 2000, Rik van Riel wrote:
> >
> > [...]
> > > They are niced because the user thinks them a bit less
> > > important.
> >
> > Please don't, this assumption is quite wrong. I use nice just to
> > be 'nice' to other users. I can run my *important* CPU hog
> > simulation nice +10 in order to let other people get more CPU
> > when they need it.
>
> In that case the time the process has been running and the
> CPU time used will save the process if it's been running for
> a long time.

Please base this more on real time, not CPU time.  Netscrape consumes an
ungodly amount of CPU time and memory and I'd much rather have it killed
before anything else on the system.  If it wasn't blatantly obvious, I'd
check the argv[0] to see if it was "netscape" and kill it.  :]


> This doesn't make ANY sense at all. The objective is to destroy
> the least amount of work, which means giving a bonus to processes
> which have used a lot of CPU time already ... regardless of nice
> value.

But that favors ill written programs if it's based on CPU time.  I.e.
netscape and all the gnome/kde "tasklets" that take 8 megs w/ 6megs RSS and
.9% of a pIII-450 to manage the keyboard and mouse properties.


> If you have a better algorithm, feel free to send patches.

Step A) A heuristic that tags the program or process group consuming the most
RAM in the last N real minutes.  A run-away process is the highest target
here and is most correctly the guilty party.

Step B) A sliding scale of the youngest real program or process group
consuming the most RAM.

Step C) ...
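
Steps A and B could be approximated as below. This is a rough userspace
sketch in C under stated assumptions: the kernel is assumed to keep a
snapshot of each program's RAM use from N minutes ago, "youngest" is
modelled by dividing size by age, and every name here is hypothetical.

```c
/* Hypothetical per-program record; names and units are illustrative. */
struct prog {
	unsigned long ram_now_kb;       /* current RAM use */
	unsigned long ram_n_min_ago_kb; /* RAM use N real minutes ago (0 if new) */
	unsigned long age_min;          /* real (wall-clock) age in minutes */
};

/* Step A: how much RAM the program gained during the last N minutes. */
static unsigned long recent_growth(const struct prog *p)
{
	if (p->ram_now_kb <= p->ram_n_min_ago_kb)
		return 0;
	return p->ram_now_kb - p->ram_n_min_ago_kb;
}

/* Step B: sliding scale, large and young programs score highest. */
static unsigned long youth_weighted_size(const struct prog *p)
{
	return p->ram_now_kb / (1 + p->age_min);
}

/* Pick by step A, using step B to break ties; returns an index or -1. */
static int pick_offender(const struct prog *p, int n)
{
	int i, best = -1;

	for (i = 0; i < n; i++) {
		if (best < 0 ||
		    recent_growth(&p[i]) > recent_growth(&p[best]) ||
		    (recent_growth(&p[i]) == recent_growth(&p[best]) &&
		     youth_weighted_size(&p[i]) > youth_weighted_size(&p[best])))
			best = i;
	}
	return best;
}
```

With a ten-hour simulation that has stopped growing, a two-hour netscape
that grew a little, and a three-minute runaway that grew by tens of
megabytes, the runaway is selected, and no CPU-time figure is involved.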

-d

--
  "There is a natural aristocracy among men. The grounds of this are
  virtue and talents", Thomas Jefferson [1742-1826], 3rd US President





