Re: killed: out of swap

2022-06-14 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>I don't see any realistic way of doing anything with that.
>It's basically the first process that tries to allocate another page 
>when there are no more. There are no other processes at that moment in 
>time that have the problem, so why should any of them be considered?

They might be the reason for the memory shortage. You can prefer large
processes as victims or protect system services to keep the system
manageable.



Re: killed: out of swap

2022-06-14 Thread Mouse
>>  > I have a program that keeps malloc()ing (and scribbling a bit
>>  > into the allocated memory) until malloc() fails. The
>>  > intention is to put pressure on the VM system to find out how
>>  > much pool cache memory it can reclaim.

>> Such a program would be a prime candidate for declaring itself a
>> preferred out-of-swap victim.

>> Perhaps even better would be a way for userland to tell the kernel
>> "pretend you're under severe RAM pressure and free what you can"
>> without needing to actually run the system out of pages.

> Is this something madvise(2) could be extended to do?

In principle?  It could be, sure.

I would, however, suggest something else.  These operations do not
apply to specific ranges of VM, so an API that's designed for actions
on particular ranges of VM seems inappropriate.

For the former suggestion, for declaring a process a preferred (or
maybe even dispreferred) out-of-swap victim, sysctl's proc.$PID
hierarchy strikes me as a better tool, especially because root can then
protect certain processes from the command line, by making them less
preferred.

I didn't go into details of what I was imagining.  I was thinking of
each process having a preference; when the kernel runs out of swap,
instead of simply killing the faulting process it would kill one of the
processes with the maximum victim preference value, perhaps a signed
int with the default being zero or some such.  Then root could set
preference negative for particularly important processes, or positive
for fluff, or very positive for things like the "try to reclaim RAM"
process outlined.  I would say it should be like nice: any process can
make itself get less-preferred treatment, but it should take privilege
to go the other way.
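
A minimal sketch of the preference scheme described above, in C.
Everything in it is an assumption made for illustration: struct
proc_info, kill_pref, may_set_kill_pref() and pick_victim() are
hypothetical names, not NetBSD kernel interfaces.  The tie-break (the
faulting process wins ties, which keeps today's behaviour when every
preference is zero) is one possible choice, not something the proposal
specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* HYPOTHETICAL per-process record; not a real NetBSD structure. */
struct proc_info {
    pid_t pid;
    int   kill_pref;   /* signed, default 0; higher = preferred victim */
};

/*
 * nice-like rule: any process may make itself a more likely victim
 * (raise the value), but lowering it requires privilege.
 */
static bool
may_set_kill_pref(int old_pref, int new_pref, bool privileged)
{
    return privileged || new_pref >= old_pref;
}

/*
 * Return the index of a process with the maximum preference value.
 * The search starts from the faulting process and only moves on a
 * strictly larger value, so the faulting process wins any tie with it.
 */
static size_t
pick_victim(const struct proc_info *procs, size_t n, size_t faulting)
{
    size_t best = faulting;

    assert(procs != NULL && faulting < n);
    for (size_t i = 0; i < n; i++) {
        if (procs[i].kill_pref > procs[best].kill_pref)
            best = i;
    }
    return best;
}
```

With all preferences left at zero this degenerates to "kill the
faulting process"; a single process that has set a positive value is
chosen no matter which process took the fault.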

For the latter suggestion, for provoking what reclaims are possible
without actually risking running out of pages...I'm not sure.  A new
syscall?  A new sysctl?  An ioctl on /dev/mem?  There are many
possibilities.

/~\ The ASCII   Mouse
\ / Ribbon Campaign
 X  Against HTML  mo...@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: killed: out of swap

2022-06-14 Thread David Holland
On Tue, Jun 14, 2022 at 07:59:33PM +0200, Johnny Billquist wrote:
 > > What might be interesting is a way to influence the order in which
 > > processes are chosen to kill...
 > 
 > I don't see any realistic way of doing anything with that.
 > It's basically the first process that tries to allocate another page when
 > there are no more. There are no other processes at that moment in time that
 > have the problem, so why should any of them be considered?

There are certainly ways. The history of the linux oom-killer makes
for pretty good reading, actually.

You want to find the process that's actually causing the problem, not
some random process that happens to be the one to step in the hole.

NetBSD doesn't do this very well; usually in my experience an OOM
situation results in syslogd getting killed, which is both unhelpful
in terms of freeing up resources and actively bad for other reasons.
It's also not uncommon for the ax to fall on the X server, which does
usually end up releasing resources but is pretty much never what the
user wants.

-- 
David A. Holland
dholl...@netbsd.org


Re: killed: out of swap

2022-06-14 Thread Mouse
>>  > I have a program that keeps malloc()ing (and scribbling a bit
>>  > into the allocated memory) until malloc() fails.  The
>>  > intention is to put pressure on the VM system to find out how
>>  > much pool cache memory it can reclaim.
>> Such a program would be a prime candidate for declaring itself a
>> preferred out-of-swap victim.
> Well.  First of all, such a program doesn't exist, as the malloc is not
> failing.

Strictly speaking, yes.  I was talking about a program that satisfies
the intention, rather than the letter of the description.  In this
case, something like "a program that keeps grabbing and touching memory
as long as it can, to provoke memory reclaims in the kernel", with its
death being due to out-of-swap rather than as a reaction to a failing
malloc.

>> It probably wouldn't be easy - the process which incurred the page
>> fault would have to be put to sleep pending the death of the victim
>> process - but it could provide for much better behaviour in
>> situations like this.
> Second - are you proposing that you'd keep some kind of statistics on
> mallocs done in the past in some way, in order to decide that this
> process should now be a candidate for a kill when you run out of
> pages?

No.  Trying to do something "smart" with process behaviour history
strikes me as a disaster waiting to happen.

What I'm suggesting is that a program with the intent sketched above
should do something like
	set_swap_kill_preference(1);
(or whatever the chosen API is - perhaps using sysctl?) in order to
indicate to the kernel that it is volunteering itself as a preferred
out-of-swap victim, even if it's not the process that incurs the
problematic page fault.
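
A rough sketch of such a program, in C.  set_swap_kill_preference() is
HYPOTHETICAL: no such call exists, so it is stubbed out here to fail
with ENOSYS.  grab_and_touch() and struct block are likewise names
invented for the example, and the max_bytes cap and the final free loop
are only there so the sketch is safe to run; a real pressure tool would
keep going until the kernel killed it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* List header stored at the start of every chunk so it can be freed. */
struct block {
    struct block *next;
};

/* HYPOTHETICAL stand-in for the proposed API; always fails with ENOSYS. */
static int
set_swap_kill_preference(int pref)
{
    (void)pref;
    errno = ENOSYS;
    return -1;
}

/*
 * Allocate chunk-sized blocks and write one byte into every page of
 * each, until malloc() fails or max_bytes have been touched.  Returns
 * the number of bytes allocated and touched.  On an overcommitting
 * system without the cap, the process would normally die in the write
 * loop (page fault), not see malloc() return NULL.
 */
static size_t
grab_and_touch(size_t chunk, size_t max_bytes)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = ps > 0 ? (size_t)ps : 4096;
    struct block *head = NULL;
    size_t total = 0;

    if (chunk < sizeof(struct block))
        return 0;
    (void)set_swap_kill_preference(1);  /* volunteer as victim (stub) */

    while (chunk <= max_bytes - total) {
        struct block *b = malloc(chunk);
        if (b == NULL)
            break;
        for (size_t off = 0; off < chunk; off += page)
            ((volatile char *)b)[off] = 1;  /* force a real page */
        b->next = head;
        head = b;
        total += chunk;
    }
    assert(total <= max_bytes);

    while (head != NULL) {              /* release everything again */
        struct block *next = head->next;
        free(head);
        head = next;
    }
    return total;
}
```

For example, grab_and_touch(1 << 20, 8u << 20) touches 8 MiB in 1 MiB
chunks and then gives it back.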

> Even more, how do we decide that it's actually malloc, or is any
> memory demand treated equally?

It's not truly malloc(); like most of this discussion, it's really
talking about a process that needs a new page to satisfy a write to a
COW or demand-zero page when there are no pages available.

To put it loosely, anything that could provoke an out-of-swap SIGKILL
from the kernel.

>> Perhaps even better would be a way for userland to tell the kernel
>> "pretend you're under severe RAM pressure and free what you can"
>> without needing to actually run the system out of pages.
> Well, the process killing only happens when we are really out of
> pages, so no amount of "free what you can" helps (unless I'm
> confused).

This was an attempt to address not the symptom but the underlying
desire.  In this case, the original poster was saying, approximately,
"I wanted to do $THING and tried to do so [...outline...], and am
having $PROBLEM".  Most of this thread has been talking about
addressing $PROBLEM.  The above quote of mine ("Perhaps even
better...") was about addressing the desire for $THING differently, in
a way that doesn't risk provoking $PROBLEM.

/~\ The ASCII   Mouse
\ / Ribbon Campaign
 X  Against HTML  mo...@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: killed: out of swap

2022-06-14 Thread Brian Buhrow
hello.  Is this something madvise(2) could be extended to do?
-thanks
-Brian

On Jun 14,  2:47pm, Mouse wrote:
} Subject: Re: killed: out of swap
} >> What might be interesting is a way to influence the order in which
} >> processes are chosen to kill...
} > I don't see any realistic way of doing anything with that.  It's
} > basically the first process that tries to allocate another page when
} > there are no more.  There are no other processes at that moment in
} > time that have the problem, so why should any of them be considered?
} 
} To answer that, consider the original poster's situation:
} 
}   > I have a program that keeps malloc()ing (and scribbling a bit
}   > into the allocated memory) until malloc() fails. The
}   > intention is to put pressure on the VM system to find out how
}   > much pool cache memory it can reclaim.
} 
} Such a program would be a prime candidate for declaring itself a
} preferred out-of-swap victim.  SunOS chill(1) - or was it chill(8)? -
} might be another example, though that's of minimal relevance to NetBSD.
} 
} It probably wouldn't be easy - the process which incurred the page
} fault would have to be put to sleep pending the death of the victim
} process - but it could provide for much better behaviour in situations
} like this.
} 
} Perhaps even better would be a way for userland to tell the kernel
} "pretend you're under severe RAM pressure and free what you can"
} without needing to actually run the system out of pages.
} 
} /~\ The ASCII   Mouse
} \ / Ribbon Campaign
}  X  Against HTML  mo...@rodents-montreal.org
} / \ Email! 7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
>-- End of excerpt from Mouse




Re: killed: out of swap

2022-06-14 Thread Johnny Billquist




On 2022-06-14 20:47, Mouse wrote:

>>> What might be interesting is a way to influence the order in which
>>> processes are chosen to kill...
>>
>> I don't see any realistic way of doing anything with that.  It's
>> basically the first process that tries to allocate another page when
>> there are no more.  There are no other processes at that moment in
>> time that have the problem, so why should any of them be considered?
>
> To answer that, consider the original poster's situation:
>
>> I have a program that keeps malloc()ing (and scribbling a bit
>> into the allocated memory) until malloc() fails. The
>> intention is to put pressure on the VM system to find out how
>> much pool cache memory it can reclaim.
>
> Such a program would be a prime candidate for declaring itself a
> preferred out-of-swap victim.  SunOS chill(1) - or was it chill(8)? -
> might be another example, though that's of minimal relevance to NetBSD.


Well. First of all, such a program doesn't exist, as the malloc is not
failing.



> It probably wouldn't be easy - the process which incurred the page
> fault would have to be put to sleep pending the death of the victim
> process - but it could provide for much better behaviour in situations
> like this.


Second - are you proposing that you'd keep some kind of statistics on
mallocs done in the past in some way, in order to decide that this
process should now be a candidate for a kill when you run out of pages?
Are you sure it is a better candidate than the current process, which
is actually demanding more pages at the moment? The other process might
not in fact ever hit the condition, and would run on happily ever
after, even though it did call malloc a number of times earlier.


Even more, how do we decide that it's actually malloc, or is any memory
demand treated equally?


And yes, it also raises the question of how to handle the current
process that caused a page demand that cannot be fulfilled at the moment.



> Perhaps even better would be a way for userland to tell the kernel
> "pretend you're under severe RAM pressure and free what you can"
> without needing to actually run the system out of pages.


Well, the process killing only happens when we are really out of pages, 
so no amount of "free what you can" helps (unless I'm confused).


  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-14 Thread Mouse
>> What might be interesting is a way to influence the order in which
>> processes are chosen to kill...
> I don't see any realistic way of doing anything with that.  It's
> basically the first process that tries to allocate another page when
> there are no more.  There are no other processes at that moment in
> time that have the problem, so why should any of them be considered?

To answer that, consider the original poster's situation:

> I have a program that keeps malloc()ing (and scribbling a bit
> into the allocated memory) until malloc() fails. The
> intention is to put pressure on the VM system to find out how
> much pool cache memory it can reclaim.

Such a program would be a prime candidate for declaring itself a
preferred out-of-swap victim.  SunOS chill(1) - or was it chill(8)? -
might be another example, though that's of minimal relevance to NetBSD.

It probably wouldn't be easy - the process which incurred the page
fault would have to be put to sleep pending the death of the victim
process - but it could provide for much better behaviour in situations
like this.

Perhaps even better would be a way for userland to tell the kernel
"pretend you're under severe RAM pressure and free what you can"
without needing to actually run the system out of pages.

/~\ The ASCII   Mouse
\ / Ribbon Campaign
 X  Against HTML  mo...@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: killed: out of swap

2022-06-14 Thread Johnny Billquist

On 2022-06-14 19:57, David Brownlee wrote:

> On Tue, 14 Jun 2022 at 13:33, Robert Elz  wrote:
>>
>> NetBSD implements overcommitted swap - many processes malloc()
>> (or mmap() which that really becomes in the current implementation)
>> far more memory than they're ever going to actually use.  It is only
>> when some real physical memory is required (rather than simply a marker
>> "zero filled page might be required here") that the system actually
>> allocates any real resources.   Similarly pages mapped from a file only
>> need swap space if they're altered - otherwise the file serves as the
>> backing store for it.
>>
>> Once upon a time there was a method to turn overcommitted swap off, and
>> require actual allocations (of RAM or swap) to be made for all reserved
>> (virtual) memory.  I used to enable that all the time - but I haven't seen
>> any mention of it in ages, and the mechanism might no longer still exist.
>
> What might be interesting is a way to influence the order in which
> processes are chosen to kill...


I don't see any realistic way of doing anything with that.
It's basically the first process that tries to allocate another page 
when there are no more. There are no other processes at that moment in 
time that have the problem, so why should any of them be considered?


  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-14 Thread David Brownlee
On Tue, 14 Jun 2022 at 13:33, Robert Elz  wrote:
>
> NetBSD implements overcommitted swap - many processes malloc()
> (or mmap() which that really becomes in the current implementation)
> far more memory than they're ever going to actually use.  It is only
> when some real physical memory is required (rather than simply a marker
> "zero filled page might be required here") that the system actually
> allocates any real resources.   Similarly pages mapped from a file only
> need swap space if they're altered - otherwise the file serves as the
> backing store for it.
>
> Once upon a time there was a method to turn overcommitted swap off, and
> require actual allocations (of RAM or swap) to be made for all reserved
> (virtual) memory.  I used to enable that all the time - but I haven't seen
> any mention of it in ages, and the mechanism might no longer still exist.

What might be interesting is a way to influence the order in which
processes are chosen to kill...

David


Re: killed: out of swap

2022-06-14 Thread Edgar Fuß
> I assume my impression is completely wrong (today).
OK, thanks for all the explanations and insights.


Re: killed: out of swap

2022-06-14 Thread Robert Elz
NetBSD implements overcommitted swap - many processes malloc()
(or mmap() which that really becomes in the current implementation)
far more memory than they're ever going to actually use.  It is only
when some real physical memory is required (rather than simply a marker
"zero filled page might be required here") that the system actually
allocates any real resources.   Similarly pages mapped from a file only
need swap space if they're altered - otherwise the file serves as the
backing store for it.

Once upon a time there was a method to turn overcommitted swap off, and
require actual allocations (of RAM or swap) to be made for all reserved
(virtual) memory.  I used to enable that all the time - but I haven't seen
any mention of it in ages, and the mechanism might no longer still exist.

kre



Re: killed: out of swap

2022-06-14 Thread Mouse
> I have a program that keeps malloc()ing (and scribbling a bit into
> the allocated memory) until malloc() fails.  The intention is to put
> pressure on the VM system to find out how much pool cache memory it
> can reclaim.

> When I run that program (with swap space unconfigured), it doesn't
> terminate normally, but gets killed by the kernel with "out of swap".
> Unfortunately, other processes happening to malloc() during that time
> may get killed, too.

As I think someone else said, it doesn't die in malloc, but in a page
fault incurred when accessing the new memory.

Other processes that get killed also get killed in page faults, not in
malloc()s.

> I don't quite get what the rationale for that is (or maybe I'm doing
> something stupidly wrong).  If I malloc(), and that fails, that
> should fail and not kill me, no?

One would think.

But the system overcommits VM, allowing the total VM to exceed the
available RAM (well, RAM+swap, but you said you had no swap
configured).  This is semi-necessary, to support things like a large
process doing fork+exec (where almost all the VM is COW-shared between
the fork and the exec, but doubling the VM use of the process would
exceed available RAM).  But it does mean that trying to allocate a new
page, usually during a write pagefault to a COW or demand-zero page,
can leave the OS with no available pages.  Since there is no way to
return an error to the "offending" process, the only alternatives are
to kill the process, kill some _other_ process, or put the faulting
process to sleep indefinitely in the hope that someone else will free
up some memory.  Someone chose the first of those options.  (It is
arguably the best of a rather bad lot.)
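
The COW case can be seen from userland with a small C sketch.
parent_byte_after_child_write() is a name made up for this example.
The point it illustrates is that fork() itself copies nothing; the
child's writes are what force fresh pages to be allocated, and that
allocation happens inside a page fault, where there is no return value
through which to report failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fill a buffer with 'P', fork, and let the child overwrite its view
 * with 'C'.  Returns the first byte the parent sees afterwards, or -1
 * on error.  The parent still sees 'P' because the child's writes went
 * to private copies made at fault time.
 */
static int
parent_byte_after_child_write(size_t len)
{
    unsigned char *buf;
    pid_t pid;
    int status, seen;

    assert(len > 0);
    buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', len);      /* pages are really allocated now */

    pid = fork();               /* child shares them copy-on-write */
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        memset(buf, 'C', len);  /* each write fault needs a new page */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    seen = buf[0];
    free(buf);
    return seen;
}
```

If the child were followed by an exec instead of the memset, none of
those pages would ever need copying, which is why refusing the fork up
front for lack of RAM would be overly strict.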

I don't know whether there is a way to turn off VM overcommit.

/~\ The ASCII   Mouse
\ / Ribbon Campaign
 X  Against HTML  mo...@rodents-montreal.org
/ \ Email! 7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: killed: out of swap

2022-06-14 Thread Johnny Billquist

On 2022-06-14 12:59, Edgar Fuß wrote:

>> So what should the kernel do?
>
> I don't know how things work under the hood today (I might have partially
> known in the times of sbrk()), but I would suppose that malloc() will
> ultimately result in some system call enlarging the heap/data
> segment/whatever. That system call could simply fail.
>
> I assume my impression is completely wrong (today). But then, how can
> a malloc() fail before the process gets killed?


Process limits, for one. Another reason would be fragmented virtual
memory, when you request a chunk too big to fit in any free range.


But malloc today relies on the lazy memory grabbing of the pager. Until
you actually reference the memory, it doesn't yet have to be backed by
anything. (Unless I remember something wrong.)
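
The process-limit case is the easiest one to demonstrate.  A small C
sketch, assuming a POSIX system that provides RLIMIT_AS;
cap_address_space() and malloc_fails() are names invented for this
example.  Once the soft address-space limit is lowered, a request that
cannot fit under it makes malloc() return NULL right away, with no page
ever being touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Lower the soft RLIMIT_AS to at most `limit` bytes (clamped to the
 * hard limit).  Returns 0 on success, -1 on failure.
 */
static int
cap_address_space(rlim_t limit)
{
    struct rlimit rl;

    assert(limit > 0);
    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && limit > rl.rlim_max)
        limit = rl.rlim_max;
    rl.rlim_cur = limit;
    return setrlimit(RLIMIT_AS, &rl);
}

/*
 * True if malloc(size) returns NULL.  The volatile pointer keeps the
 * compiler from optimising the malloc/free pair away.
 */
static bool
malloc_fails(size_t size)
{
    void *volatile p = malloc(size);

    if (p == NULL)
        return true;
    free(p);
    return false;
}
```

For example, after cap_address_space((rlim_t)512 << 20), a call to
malloc_fails((size_t)1 << 30) should report a clean failure, while
small requests keep working.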


  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol


Re: killed: out of swap

2022-06-14 Thread Edgar Fuß
> So what should the kernel do?
I don't know how things work under the hood today (I might have partially
known in the times of sbrk()), but I would suppose that malloc() will
ultimately result in some system call enlarging the heap/data
segment/whatever. That system call could simply fail.

I assume my impression is completely wrong (today). But then, how can 
a malloc() fail before the process gets killed?


Re: killed: out of swap

2022-06-14 Thread Johnny Billquist
It's not the malloc that fails. It's the vm system trying to get a page 
for you. At which point it might not be your process that is trying to 
get a page when there are none free... So what should the kernel do?


  Johnny

On 2022-06-14 12:01, Edgar Fuß wrote:

> I have a program that keeps malloc()ing (and scribbling a bit into the
> allocated memory) until malloc() fails. The intention is to put pressure
> on the VM system to find out how much pool cache memory it can reclaim.
>
> When I run that program (with swap space unconfigured), it doesn't terminate
> normally, but gets killed by the kernel with "out of swap". Unfortunately,
> other processes happening to malloc() during that time may get killed, too.
>
> I don't quite get what the rationale for that is (or maybe I'm doing
> something stupidly wrong). If I malloc(), and that fails, that should fail
> and not kill me, no?
>
> I'm surely missing something.


--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol


killed: out of swap

2022-06-14 Thread Edgar Fuß
I have a program that keeps malloc()ing (and scribbling a bit into the 
allocated memory) until malloc() fails. The intention is to put pressure 
on the VM system to find out how much pool cache memory it can reclaim.

When I run that program (with swap space unconfigured), it doesn't terminate 
normally, but gets killed by the kernel with "out of swap". Unfortunately, 
other processes happening to malloc() during that time may get killed, too.

I don't quite get what the rationale for that is (or maybe I'm doing
something stupidly wrong). If I malloc(), and that fails, that should fail
and not kill me, no?

I'm surely missing something.