Re: killed: out of swap
b...@softjar.se (Johnny Billquist) writes:

>I don't see any realistic way of doing anything with that.
>It's basically the first process that tries to allocate another page
>when there are no more. There are no other processes at that moment in
>time that have the problem, so why should any of them be considered?

They might be the reason for the memory shortage. You can prefer
large processes as victims or protect system services to keep the
system manageable.
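For illustration, the two policies named in this message (prefer large processes, protect system services) could be combined roughly as below. This is a userland simulation only: `struct proc_info`, its fields, and `pick_largest_unprotected()` are hypothetical names invented for the sketch and do not correspond to anything in the NetBSD kernel.

```c
#include <stddef.h>

/* Hypothetical per-process record; not a real kernel structure. */
struct proc_info {
    int    pid;
    size_t resident_pages;  /* pages currently held by the process */
    int    is_protected;    /* nonzero for services that must survive */
};

/*
 * Return the index of the unprotected process holding the most pages,
 * or -1 if every process is protected (or the table is empty).
 */
int
pick_largest_unprotected(const struct proc_info *procs, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (procs[i].is_protected)
            continue;
        if (best < 0 ||
            procs[i].resident_pages > procs[best].resident_pages)
            best = (int)i;
    }
    return best;
}
```

With a table in which the largest process is marked protected, the function skips it and returns the next largest one, which is the behaviour the message asks for.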
Re: killed: out of swap
>> > I have a program that keeps malloc()ing (and scribbling a bit
>> > into the allocated memory) until malloc() fails. The
>> > intention is to put pressure on the VM system to find out how
>> > much pool cache memory it can reclaim.
>> Such a program would be a prime candidate for declaring itself a
>> preferred out-of-swap victim.

>> Perhaps even better would be a way for userland to tell the kernel
>> "pretend you're under severe RAM pressure and free what you can"
>> without needing to actually run the system out of pages.

> Is this something madvise(2) could be extended to do?

In principle? It could be, sure.

I would, however, suggest something else. These operations do not
apply to specific ranges of VM, so an API that's designed for actions
on particular ranges of VM seems inappropriate.

For the former suggestion, for declaring a process a preferred (or
maybe even dispreferred) out-of-swap victim, sysctl's proc.$PID
hierarchy strikes me as a better tool, especially because root can then
protect certain processes from the command line, by making them less
preferred.

I didn't go into details of what I was imagining. I was thinking of
each process having a preference; when the kernel runs out of swap,
instead of simply killing the faulting process it would kill one of the
processes with the maximum victim preference value, perhaps a signed
int with the default being zero or some such. Then root could set
preference negative for particularly important processes, or positive
for fluff, or very positive for things like the "try to reclaim RAM"
process outlined. I would say it should be like nice: any process can
make itself get less-preferred treatment, but it should take privilege
to go the other way.

For the latter suggestion, for provoking what reclaims are possible
without actually risking running out of pages...I'm not sure. A new
syscall? A new sysctl? An ioctl on /dev/mem? There are many
possibilities.
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
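The preference scheme described in this message can be sketched as a small userland simulation. Everything here is hypothetical: `struct proc_pref`, `set_kill_pref()` and `pick_victim()` are names made up for the example, and the tie-breaking rule (prefer the faulting process among equal maxima) is an assumption that the message does not spell out.

```c
#include <stddef.h>

/* Hypothetical record; the field names are illustrative only. */
struct proc_pref {
    int pid;
    int kill_pref;  /* signed; 0 is the default, higher dies first */
};

/*
 * Nice-like rule: anyone may raise a preference (volunteer as a victim),
 * but lowering it (asking for protection) needs privilege.
 * Returns 0 on success, -1 if the change is refused.
 */
int
set_kill_pref(struct proc_pref *p, int new_pref, int is_privileged)
{
    if (new_pref < p->kill_pref && !is_privileged)
        return -1;
    p->kill_pref = new_pref;
    return 0;
}

/*
 * Pick the victim: a process with the maximum preference value.
 * Assumed tie-break: if the faulting process is among the maxima it is
 * chosen, otherwise the first maximal entry wins.
 * Returns an index into procs, or -1 for an empty table.
 */
int
pick_victim(const struct proc_pref *procs, size_t n, size_t faulting)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (best < 0 || procs[i].kill_pref > procs[best].kill_pref)
            best = (int)i;
    }
    if (best >= 0 && faulting < n &&
        procs[faulting].kill_pref == procs[best].kill_pref)
        best = (int)faulting;
    return best;
}
```

With that tie-break, a system where nobody has changed the default behaves as it does today (the faulting process dies), and the new mechanism only matters once some process has volunteered or been protected.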
Re: killed: out of swap
On Tue, Jun 14, 2022 at 07:59:33PM +0200, Johnny Billquist wrote:
 > > What might be interesting is a way to influence the order in which
 > > processes are chosen to kill...
 >
 > I don't see any realistic way of doing anything with that.
 > It's basically the first process that tries to allocate another page when
 > there are no more. There are no other processes at that moment in time that
 > have the problem, so why should any of them be considered?

There are certainly ways. The history of the Linux oom-killer makes
for pretty good reading, actually. You want to find the process that's
actually causing the problem, not some random process that happens to
be the one to step in the hole.

NetBSD doesn't do this very well; usually in my experience an OOM
situation results in syslogd getting killed, which is both unhelpful
in terms of freeing up resources and actively bad for other reasons.
It's also not uncommon for the ax to fall on the X server, which does
usually end up releasing resources but is pretty much never what the
user wants.

--
David A. Holland
dholl...@netbsd.org
Re: killed: out of swap
>> > I have a program that keeps malloc()ing (and scribbling a bit
>> > into the allocated memory) until malloc() fails. The
>> > intention is to put pressure on the VM system to find out how
>> > much pool cache memory it can reclaim.
>> Such a program would be a prime candidate for declaring itself a
>> preferred out-of-swap victim.
> Well. First of all, such a program doesn't exist, as the malloc is not
> failing.

Strictly speaking, yes. I was talking about a program that satisfies
the intention, rather than the letter of the description. In this
case, something like "a program that keeps grabbing and touching memory
as long as it can, to provoke memory reclaims in the kernel", with its
death being due to out-of-swap rather than as a reaction to a failing
malloc.

>> It probably wouldn't be easy - the process which incurred the page
>> fault would have to be put to sleep pending the death of the victim
>> process - but it could provide for much better behaviour in
>> situations like this.
> Second - are you proposing that you'd keep some kind of statistics on
> mallocs done in the past in some way, in order to decide that this
> process should now be a candidate for a kill when you run out of
> pages?

No. Trying to do something "smart" with process behaviour history
strikes me as a disaster waiting to happen.

What I'm suggesting is that a program with the intent sketched above
should do something like

	set_swap_kill_preference(1);

(or whatever the chosen API is - perhaps using sysctl?) in order to
indicate to the kernel that it is volunteering itself as a preferred
out-of-swap victim, even if it's not the process that incurs the
problematic page fault.

> Even more, how do we decide that it's actually malloc, or is any
> memory demand treated equally?

It's not truly malloc(); like most of this discussion, it's really
talking about a process that needs a new page to satisfy a write to a
COW or demand-zero page when there are no pages available. To put it
loosely, anything that could provoke an out-of-swap SIGKILL from the
kernel.

>> Perhaps even better would be a way for userland to tell the kernel
>> "pretend you're under severe RAM pressure and free what you can"
>> without needing to actually run the system out of pages.
> Well, the process killing only happens when we are really out of
> pages, so no amount of "free what you can" helps (unless I'm
> confused).

This was an attempt to address not the symptom but the underlying
desire. In this case, the original poster was saying, approximately,
"I wanted to do $THING and tried to do so [...outline...], and am
having $PROBLEM". Most of this thread has been talking about
addressing $PROBLEM. The above quote of mine ("Perhaps even
better...") was about addressing the desire for $THING differently, in
a way that doesn't risk provoking $PROBLEM.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: killed: out of swap
Hello. Is this something madvise(2) could be extended to do?
-thanks
-Brian

On Jun 14, 2:47pm, Mouse wrote:
} Subject: Re: killed: out of swap
} >> What might be interesting is a way to influence the order in which
} >> processes are chosen to kill...
} > I don't see any realistic way of doing anything with that. It's
} > basically the first process that tries to allocate another page when
} > there are no more. There are no other processes at that moment in
} > time that have the problem, so why should any of them be considered?
}
} To answer that, consider the original poster's situation:
}
} > I have a program that keeps malloc()ing (and scribbling a bit
} > into the allocated memory) until malloc() fails. The
} > intention is to put pressure on the VM system to find out how
} > much pool cache memory it can reclaim.
}
} Such a program would be a prime candidate for declaring itself a
} preferred out-of-swap victim. SunOS chill(1) - or was it chill(8)? -
} might be another example, though that's of minimal relevance to NetBSD.
}
} It probably wouldn't be easy - the process which incurred the page
} fault would have to be put to sleep pending the death of the victim
} process - but it could provide for much better behaviour in situations
} like this.
}
} Perhaps even better would be a way for userland to tell the kernel
} "pretend you're under severe RAM pressure and free what you can"
} without needing to actually run the system out of pages.
}
} /~\ The ASCII                             Mouse
} \ / Ribbon Campaign
}  X  Against HTML                mo...@rodents-montreal.org
} / \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
>-- End of excerpt from Mouse
Re: killed: out of swap
On 2022-06-14 20:47, Mouse wrote:
>>> What might be interesting is a way to influence the order in which
>>> processes are chosen to kill...
>> I don't see any realistic way of doing anything with that. It's
>> basically the first process that tries to allocate another page when
>> there are no more. There are no other processes at that moment in
>> time that have the problem, so why should any of them be considered?
>
> To answer that, consider the original poster's situation:
>
>> I have a program that keeps malloc()ing (and scribbling a bit
>> into the allocated memory) until malloc() fails. The
>> intention is to put pressure on the VM system to find out how
>> much pool cache memory it can reclaim.
>
> Such a program would be a prime candidate for declaring itself a
> preferred out-of-swap victim. SunOS chill(1) - or was it chill(8)? -
> might be another example, though that's of minimal relevance to NetBSD.

Well. First of all, such a program doesn't exist, as the malloc is not
failing.

> It probably wouldn't be easy - the process which incurred the page
> fault would have to be put to sleep pending the death of the victim
> process - but it could provide for much better behaviour in situations
> like this.

Second - are you proposing that you'd keep some kind of statistics on
mallocs done in the past in some way, in order to decide that this
process should now be a candidate for a kill when you run out of pages?
Are you sure it is a better candidate than the current process, which is
actually demanding more pages at the moment? The other process might
not in fact ever hit the condition, and would run on happily ever
after, even though it did call malloc a number of times earlier.

Even more, how do we decide that it's actually malloc, or is any memory
demand treated equally?

And yes, it also raises the question of how to handle the current
process that caused a page demand that cannot be fulfilled at the
moment.

> Perhaps even better would be a way for userland to tell the kernel
> "pretend you're under severe RAM pressure and free what you can"
> without needing to actually run the system out of pages.

Well, the process killing only happens when we are really out of pages,
so no amount of "free what you can" helps (unless I'm confused).

  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: killed: out of swap
>> What might be interesting is a way to influence the order in which
>> processes are chosen to kill...
> I don't see any realistic way of doing anything with that. It's
> basically the first process that tries to allocate another page when
> there are no more. There are no other processes at that moment in
> time that have the problem, so why should any of them be considered?

To answer that, consider the original poster's situation:

> I have a program that keeps malloc()ing (and scribbling a bit
> into the allocated memory) until malloc() fails. The
> intention is to put pressure on the VM system to find out how
> much pool cache memory it can reclaim.

Such a program would be a prime candidate for declaring itself a
preferred out-of-swap victim. SunOS chill(1) - or was it chill(8)? -
might be another example, though that's of minimal relevance to NetBSD.

It probably wouldn't be easy - the process which incurred the page
fault would have to be put to sleep pending the death of the victim
process - but it could provide for much better behaviour in situations
like this.

Perhaps even better would be a way for userland to tell the kernel
"pretend you're under severe RAM pressure and free what you can"
without needing to actually run the system out of pages.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: killed: out of swap
On 2022-06-14 19:57, David Brownlee wrote:
> On Tue, 14 Jun 2022 at 13:33, Robert Elz wrote:
>>
>> NetBSD implements overcommitted swap - many processes malloc()
>> (or mmap() which that really becomes in the current implementation)
>> far more memory than they're ever going to actually use. It is only
>> when some real physical memory is required (rather than simply a marker
>> "zero filled page might be required here") that the system actually
>> allocates any real resources. Similarly pages mapped from a file only
>> need swap space if they're altered - otherwise the file serves as the
>> backing store for it.
>>
>> Once upon a time there was a method to turn overcommitted swap off, and
>> require actual allocations (of RAM or swap) to be made for all reserved
>> (virtual) memory. I used to enable that all the time - but I haven't seen
>> any mention of it in ages, and the mechanism might no longer still exist.
>
> What might be interesting is a way to influence the order in which
> processes are chosen to kill...

I don't see any realistic way of doing anything with that.
It's basically the first process that tries to allocate another page
when there are no more. There are no other processes at that moment in
time that have the problem, so why should any of them be considered?

  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: killed: out of swap
On Tue, 14 Jun 2022 at 13:33, Robert Elz wrote:
>
> NetBSD implements overcommitted swap - many processes malloc()
> (or mmap() which that really becomes in the current implementation)
> far more memory than they're ever going to actually use. It is only
> when some real physical memory is required (rather than simply a marker
> "zero filled page might be required here") that the system actually
> allocates any real resources. Similarly pages mapped from a file only
> need swap space if they're altered - otherwise the file serves as the
> backing store for it.
>
> Once upon a time there was a method to turn overcommitted swap off, and
> require actual allocations (of RAM or swap) to be made for all reserved
> (virtual) memory. I used to enable that all the time - but I haven't seen
> any mention of it in ages, and the mechanism might no longer still exist.

What might be interesting is a way to influence the order in which
processes are chosen to kill...

David
Re: killed: out of swap
> I assume my impression is completely wrong (today).

OK, thanks for all the explanations and insights.
Re: killed: out of swap
NetBSD implements overcommitted swap - many processes malloc()
(or mmap() which that really becomes in the current implementation)
far more memory than they're ever going to actually use. It is only
when some real physical memory is required (rather than simply a marker
"zero filled page might be required here") that the system actually
allocates any real resources. Similarly pages mapped from a file only
need swap space if they're altered - otherwise the file serves as the
backing store for it.

Once upon a time there was a method to turn overcommitted swap off, and
require actual allocations (of RAM or swap) to be made for all reserved
(virtual) memory. I used to enable that all the time - but I haven't seen
any mention of it in ages, and the mechanism might no longer still exist.

kre
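The distinction drawn above, between reserving address space and actually needing a page, can be made concrete with a small C sketch. It assumes a POSIX-like system with anonymous mmap(); `reserve_then_touch()` is a name invented for the example, and the multiplication is not checked for overflow, so keep `npages` small.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANON on glibc in strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve npages of anonymous memory, then write one byte into each page.
 * mmap() only sets up the mapping; it is the write in the loop that makes
 * the kernel find a real page for the address.
 * Returns the number of pages touched, or -1 if the mapping could not be
 * created at all.
 */
long
reserve_then_touch(size_t npages)
{
    long pagesz = sysconf(_SC_PAGESIZE);

    if (pagesz <= 0 || npages == 0)
        return -1;

    size_t len = npages * (size_t)pagesz;
    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_PRIVATE, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    volatile char *p = base;
    long touched = 0;
    for (size_t i = 0; i < npages; i++) {
        p[i * (size_t)pagesz] = 1;  /* first write: demand-zero fault */
        touched++;
    }
    munmap(base, len);
    return touched;
}
```

The only point where this function can report failure is the mmap() call. If the system runs out of pages during the loop instead, there is no return value to carry the error, which is where the "killed: out of swap" behaviour comes from.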
Re: killed: out of swap
> I have a program that keeps malloc()ing (and scribbling a bit into
> the allocated memory) until malloc() fails. The intention is to put
> pressure on the VM system to find out how much pool cache memory it
> can reclaim.

> When I run that program (with swap space unconfigured), it doesn't
> terminate normally, but gets killed by the kernel with "out of swap".
> Unfortunately, other processes happening to malloc() during that time
> may get killed, too.

As I think someone else said, it doesn't die in malloc, but in a page
fault incurred when accessing the new memory. Other processes that get
killed also get killed in page faults, not in malloc()s.

> I don't quite get what the rationale for that is (or maybe I'm doing
> something stupidly wrong). If I malloc(), and that fails, that
> should fail and not kill me, no?

One would think. But the system overcommits VM, allowing the total VM
to exceed the available RAM (well, RAM+swap, but you said you had no
swap configured). This is semi-necessary, to support things like a
large process doing fork+exec (where almost all the VM is COW-shared
between the fork and the exec, but doubling the VM use of the process
would exceed available RAM).

But it does mean that trying to allocate a new page, usually during a
write pagefault to a COW or demand-zero page, can leave the OS with no
available pages. Since there is no way to return an error to the
"offending" process, the only alternatives are to kill the process,
kill some _other_ process, or put the faulting process to sleep
indefinitely in the hope that someone else will free up some memory.

Someone chose the first of those options. (It is arguably the best of
a rather bad lot.)

I don't know whether there is a way to turn off VM overcommit.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
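The fork+exec argument is easy to check with arithmetic. The sketch below models strict (non-overcommitting) accounting in a few lines of C; the function names and the sizes in the usage note are made up for illustration and say nothing about how any real kernel does its bookkeeping.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Strict accounting: a reservation succeeds only if everything promised
 * so far, plus the new request, fits in RAM + swap. Written so that the
 * comparison cannot overflow.
 */
bool
strict_reserve_ok(size_t committed_mb, size_t request_mb,
    size_t ram_plus_swap_mb)
{
    return request_mb <= ram_plus_swap_mb &&
        committed_mb <= ram_plus_swap_mb - request_mb;
}

/*
 * fork() duplicates the parent's address space copy-on-write. Under
 * strict accounting the child has to be charged for the parent's whole
 * writable size, even if it calls exec() at once and never writes to
 * those pages.
 */
bool
strict_fork_ok(size_t committed_mb, size_t parent_mb,
    size_t ram_plus_swap_mb)
{
    return strict_reserve_ok(committed_mb, parent_mb, ram_plus_swap_mb);
}
```

For example, on a hypothetical machine with 8192 MB of RAM and no swap, where 6144 MB is already committed and 5120 MB of that belongs to one large process, `strict_fork_ok(6144, 5120, 8192)` is false: the fork would be refused even though the child would exec immediately. Overcommit lets that fork go through, at the price of the out-of-swap kill discussed in this thread.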
Re: killed: out of swap
On 2022-06-14 12:59, Edgar Fuß wrote:
>> So what should the kernel do?
> I don't know how things work under the hood today (I might have partially
> known in the times of sbrk()), but I would suppose that malloc() will
> ultimately result in some system call enlarging the heap/data
> segment/whatever. That system call could simply fail.
>
> I assume my impression is completely wrong (today).
> But then, how can a malloc() fail before the process gets killed?

Process limits for one. But I guess virtual memory becoming fragmented,
so that a request for too big a chunk cannot be satisfied, would be
another reason.

But malloc today relies on the lazy memory grabbing of the pager. Until
you actually reference the memory, it doesn't yet have to be backed by
anything. (Unless I remember something wrong.)

  Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
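The "process limits" case can be demonstrated with setrlimit(2). The sketch below assumes a system that provides and enforces RLIMIT_AS; `malloc_fails_under_limit()` is a name invented for the example. It lowers the soft address-space limit, tries one allocation, and puts the old limit back.

```c
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Temporarily cap the address space at limit_bytes, try one malloc() of
 * request_bytes, then restore the previous soft limit.
 * Returns 1 if malloc() returned NULL, 0 if it succeeded, and -1 if the
 * limit could not be read or changed.
 */
int
malloc_fails_under_limit(rlim_t limit_bytes, size_t request_bytes)
{
    struct rlimit old, capped;

    if (getrlimit(RLIMIT_AS, &old) != 0)
        return -1;
    capped = old;
    /* The soft limit may not exceed the hard limit. */
    capped.rlim_cur = limit_bytes < old.rlim_max ? limit_bytes : old.rlim_max;
    if (setrlimit(RLIMIT_AS, &capped) != 0)
        return -1;

    /* volatile keeps the compiler from optimizing the allocation away */
    void *volatile p = malloc(request_bytes);
    int failed = (p == NULL);
    free(p);

    (void)setrlimit(RLIMIT_AS, &old);
    return failed;
}
```

A request larger than the cap, for instance 1 GB under a 256 MB limit, is refused when the address space is reserved, so malloc() gets to return NULL. That is the polite failure the original poster expected, and it happens here only because the limit is checked at reservation time rather than at page-fault time.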
Re: killed: out of swap
> So what should the kernel do?

I don't know how things work under the hood today (I might have partially
known in the times of sbrk()), but I would suppose that malloc() will
ultimately result in some system call enlarging the heap/data
segment/whatever. That system call could simply fail.

I assume my impression is completely wrong (today).
But then, how can a malloc() fail before the process gets killed?
Re: killed: out of swap
It's not the malloc that fails. It's the vm system trying to get a page
for you. At which point it might not be your process that is trying to
get a page when there are none free...

So what should the kernel do?

  Johnny

On 2022-06-14 12:01, Edgar Fuß wrote:
> I have a program that keeps malloc()ing (and scribbling a bit into the
> allocated memory) until malloc() fails. The intention is to put pressure
> on the VM system to find out how much pool cache memory it can reclaim.
>
> When I run that program (with swap space unconfigured), it doesn't
> terminate normally, but gets killed by the kernel with "out of swap".
> Unfortunately, other processes happening to malloc() during that time
> may get killed, too.
>
> I don't quite get what the rationale for that is (or maybe I'm doing
> something stupidly wrong). If I malloc(), and that fails, that should
> fail and not kill me, no?
>
> I'm surely missing something.

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
killed: out of swap
I have a program that keeps malloc()ing (and scribbling a bit into the
allocated memory) until malloc() fails. The intention is to put pressure
on the VM system to find out how much pool cache memory it can reclaim.

When I run that program (with swap space unconfigured), it doesn't
terminate normally, but gets killed by the kernel with "out of swap".
Unfortunately, other processes happening to malloc() during that time
may get killed, too.

I don't quite get what the rationale for that is (or maybe I'm doing
something stupidly wrong). If I malloc(), and that fails, that should
fail and not kill me, no?

I'm surely missing something.
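The program itself is not shown in the thread; a minimal reconstruction of the loop it describes might look like the sketch below. The function name and the `max_chunks` cap are additions for this example (the original apparently had no cap), included so that the sketch can be run without taking the machine down.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate up to max_chunks blocks of chunk_bytes each, writing into
 * every block so that its pages are really instantiated, and stop early
 * if malloc() returns NULL. All blocks are freed before returning.
 * Returns the number of blocks obtained.
 */
size_t
grab_and_touch(size_t chunk_bytes, size_t max_chunks)
{
    void **blocks = calloc(max_chunks, sizeof(*blocks));
    size_t got = 0;

    if (blocks == NULL)
        return 0;
    while (got < max_chunks) {
        void *p = malloc(chunk_bytes);
        if (p == NULL)
            break;                      /* the failure the poster expected */
        memset(p, 0xA5, chunk_bytes);   /* scribble: forces page faults */
        blocks[got++] = p;
    }
    for (size_t i = 0; i < got; i++)
        free(blocks[i]);
    free(blocks);
    return got;
}
```

On a system that overcommits, removing the cap does not make the `p == NULL` branch any more likely to be reached. The replies in this thread explain that the process is instead killed inside the scribbling step (here the memset()), when a page fault finds no free page.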