re: UVM behavior under memory pressure

2021-04-04 Thread matthew green
one additional thing about the behaviour of vm.*{min,max}
is that they do not (currently?) consider how much memory
is consumed by major kernel consumers like pools.  eg, there
is a known radeondrmkms leak (we have not figured out why it
happens only sometimes), and this ends up making the kmem-160
or kmem-192 pool grow (which size depends on DIAGNOSTIC, IIRC),
possibly using a serious percentage of the total memory, which
can mean that summing the vm.*min values to 95% actually covers
way more than 100% of the memory really available.

so, also check what "vmstat -m" says pools are using.

(we should fix this.  anyone want to define what the right
solution is? :-)
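as a rough illustration of checking pool growth from "vmstat -m", here is a sketch that sums the page counts for the kmem-* pools.  the two sample lines and the field position are made up for illustration; real NetBSD "vmstat -m" output has more columns, so the awk field number would need adjusting:

```shell
# Hypothetical sample standing in for two `vmstat -m` pool lines
# (fields here: pool name, item size, pages in use -- NOT the real layout).
sample='kmem-160 160 8900
kmem-192 192 4050'

# Sum the page counts across all kmem-* pools.
echo "$sample" | awk '$1 ~ /^kmem-/ { pages += $3 } END { printf "kmem pools: %d pages\n", pages }'
```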


.mrg.


Re: UVM behavior under memory pressure

2021-04-02 Thread Frank Kardel

This reminds me of a major cleanup Andrew Doran did in -current.

 PR kern/54209: NetBSD 8 large memory performance extremely low
 PR kern/54210: NetBSD-8 processes presumably not exiting
 PR kern/54727: writing a large file causes unreasonable system behaviour

The logic in -current has changed so dramatically that we opted not to
back-port it to -9 and -8.

Have you tried with -current?

Best regards,
  Frank


On 04/01/21 21:03, Manuel Bouyer wrote:

hello
on a system running netbsd-9 from mid-august, I ended up in a state where
the system has little free memory, no free swap and almost 40% of RAM used
by the file cache:

load averages:  9.00,  5.02,  2.86;   up 0+11:30:56        20:57:39
97 processes: 2 runnable, 91 sleeping, 1 stopped, 3 on CPU
CPU states:  0.7% user,  0.0% nice, 96.5% system,  0.0% interrupt,  2.6% idle
Memory: 4987M Act, 2436M Inact, 123M Wired, 198M Exec, 2918M File, 4216K Free
Swap: 520M Total, 520M Used, 4K Free

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
    0 root       0    0     0K   30M CPU/3     11:27  0.00%   122% [system]
 1009 bouyer    43    0  2978M  275M parked/0   9:32 27.78% 27.78% firefox68
 1519 bouyer    26    0  2583M  141M RUN/0      0:37 20.90% 20.90% firefox68
 1243 bouyer    25    0  2912M  350M RUN/2      2:56 60.96% 20.12% firefox68
 7341 bouyer    26    0  3221M 2807M CPU/2      5:07 17.09% 17.09% openscad
  729 bouyer    76    0   202M   39M select/3  26:03 16.65% 16.65% X

Of course the system is very slow.
Shouldn't UVM choose, in this case, to reclaim pages from the file cache
for the process data?
I'm using the default vm.* sysctl values.





Re: UVM behavior under memory pressure

2021-04-01 Thread Greg A. Woods
At Thu, 1 Apr 2021 23:15:42 +0200, Manuel Bouyer wrote:
Subject: Re: UVM behavior under memory pressure
>
> Yes, I understand this. But, in an emergency situation like this one (there
> is no free ram, swap is full, openscad eventually gets killed),
> I would expect the pager to reclaim pages where it can;
> like file cache (down to vm.filemin, I agree it shouldn't go down to 0).
>
> In my case, vm.anonmax is at 80%, and I suspect it was not reached
> (I tried to increase it to 90% but this didn't change anything).

As I understand things, there's no point in increasing any vm.*max value
unless it is already far too low, you want more memory to be used for
that category, and that memory isn't already claimed by other categories
(i.e. where a competing vm.*max value is too high).

It is the vm.*min value for the desired category that isn't high enough
to allow that category to claim more pages from the less desired
categories.

I.e. if vm.anonmin is too low (and I believe the default of 10% is way
too low), then when file I/O gets busy for whatever reason, and with the
default rather high vm.filemax value, large processes _will_ get
partially paged out, as only 10% of their memory will be kept activated.

Simultaneously decreasing vm.filemax and increasing vm.anonmin should
guarantee more memory can be dedicated to processes needing it as
opposed to allowing file caching to take over.

I think in general the vm.*max limits (except maybe vm.filemax) are only
really interesting on very small memory systems and/or on systems with
very specific types of uses which might demand more pages of one
category or the other.  The default vm.filemax value on the other hand
may be too high for systems that don't _constantly_ do a lot of file I/O
_and_ access many of the same files more than once.

So if you regularly run large processes that don't necessarily do a
whole lot of file I/O then you want to reduce vm.filemax, perhaps quite
a lot, maybe even to just being barely above vm.filemin; and of course
you want to increase vm.anonmin.  One early guide suggested (with my
comments):

vm.execmin=2    # this is too low if your progs are huge code
vm.execmax=4    # but this should probably be as much as 20
vm.filemin=0
vm.filemax=1    # too low for compiling, web serving, etc.
vm.anonmin=70
vm.anonmax=95

Note that increasing vm.anonmin won't dedicate memory to anon pages if
they're not currently needed of course, but it will guarantee at least
that much memory will be made available, and kept available, when and if
pressure for anon pages increases.

So all of these limits are not "hard limits", nor are they dedicated
allocations per-se.  A given category can use more pages than its max
limit, at least until some other category experiences pressure,
i.e. until the page daemon is woken.

(Just keep in mind that one cannot currently exceed 95% as the sum of
the lower (vm.*min) limits.  The total of the upper (vm.*max) limits can
be more than 100%, but there are caveats to such a state.)
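As a rough illustration of that caveat, a proposed set of vm.*min values
can be sanity-checked before applying them.  This is only a sketch; the
three values below are example placeholders, not recommendations:

```shell
# Check that a proposed set of vm.*min values stays within the 95% cap
# (example values only; substitute whatever you intend to set via sysctl).
filemin=5; anonmin=40; execmin=20
total=$((filemin + anonmin + execmin))
if [ "$total" -le 95 ]; then
  echo "ok: minimums sum to ${total}%"
else
  echo "too high: minimums sum to ${total}%"
fi
```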

Also if you have a really large memory machine and you don't have
processes that wander through huge numbers of files, then you might also
want to lower vm.bufcache so that it's not wasted.

--
Greg A. Woods 

Kelowna, BC +1 250 762-7675   RoboHack 
Planix, Inc.  Avoncote Farms 




Re: UVM behavior under memory pressure

2021-04-01 Thread Manuel Bouyer
On Thu, Apr 01, 2021 at 01:13:05PM -0700, Greg A. Woods wrote:
> At Thu, 1 Apr 2021 21:03:37 +0200, Manuel Bouyer wrote:
> Subject: UVM behavior under memory pressure
> >
> > Of course the system is very slow
> > Shouldn't UVM choose, in this case, to reclaim pages from the file cache
> > for the process data ?
> > I'm using the default vm.* sysctl values.
> 
> I almost never use the default vm.* values.
> 
> I would guess the main problem for your system's memory requirements, at
> the time you showed it, is that the default for vm.anonmin is way too
> low and so raising vm.anonmin might help.  If vm.anonmin isn't high
> enough then the pager won't sacrifice other requirements already in play
> for anon pages.

Yes, I understand this. But, in an emergency situation like this one (there
is no free ram, swap is full, openscad eventually gets killed),
I would expect the pager to reclaim pages where it can;
like file cache (down to vm.filemin, I agree it shouldn't go down to 0).

In my case, vm.anonmax is at 80%, and I suspect it was not reached
(I tried to increase it to 90% but this didn't change anything).

I don't know what was in the file cache; in the meantime, its usage is down
to 39M. Maybe firefox had some background maintenance running ...
And now openscad can complete its rendering :)

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--


Re: UVM behavior under memory pressure

2021-04-01 Thread Greg A. Woods
At Thu, 1 Apr 2021 21:03:37 +0200, Manuel Bouyer  wrote:
Subject: UVM behavior under memory pressure
>
> Of course the system is very slow
> Shouldn't UVM choose, in this case, to reclaim pages from the file cache
> for the process data ?
> I'm using the default vm.* sysctl values.

I almost never use the default vm.* values.

I would guess the main problem for your system's memory requirements, at
the time you showed it, is that the default for vm.anonmin is way too
low and so raising vm.anonmin might help.  If vm.anonmin isn't high
enough then the pager won't sacrifice other requirements already in play
for anon pages.

Lowering vm.filemax (and maybe also vm.filemin) might also help since
your system, at that time, appeared to be doing far less I/O on large
numbers of files than, say, a web server or a compile server might be
doing.  However with almost 3G dedicated to the file cache it would seem
your system did recently trawl through a lot of file data, and so with a
lower vm.filemax less of it would have been kept as pressure for other
types of memory increased.

Here are the values I use, with comments about why, from my default
/etc/sysctl.conf.  These have worked reasonably well for me for years,
though I did have a virtual machine struggle to do some builds when I
ran too many make jobs in parallel and then a gargantuan compiler job
came along and needed too much memory.  However there was enough swap
and eventually it thrashed its way through, and more importantly I was
still able to run commands, albeit slowly, and my one large interactive
process (emacs), sometimes took quite a while to wake up and respond.

# N.B.:  On a live system make sure to order changes to these values so that you
# always lower any values from their default first, and then raise any that are
# to be raised above their defaults.  This way, the sum of the minimums will
# stay within the 95% limit.
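# To make that ordering concrete, applying these values on a live system
# might look like the sketch below (sysctl -w sets a value at runtime;
# the numbers match the entries that follow):
#
#   # Lower values below their defaults first...
#   sysctl -w vm.filemin=5      # default is 10
#   # ...then raise values above their defaults.
#   sysctl -w vm.anonmin=40     # default is 10
#   sysctl -w vm.execmin=20     # default is 5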

# the minimum percentage of memory always (made) available for the
# file data cache
#
# The default is 10, which is much too high, even for a large-memory
# system...
#
vm.filemin=5

# the maximum percentage of memory that will be reclaimed from other uses for
# file data cache
#
# The default is 50, which may be too high for small-memory systems but may be
# about right for large-memory systems...
#
#vm.filemax=25

# the minimum percentage of memory always (made) available for anonymous pages
#
# The default is 10, which is way too low...
#
vm.anonmin=40

# the maximum percentage of memory that will be reclaimed from other uses for
# anonymous pages
#
# The default is 80, which seems just about right, but then again it's unlikely
# that the majority of inactive anonymous pages will ever be reactivated so
# maybe this should be lowered?
#
#vm.anonmax=80

# the minimum percentage of memory always (made) available for text pages
#
# The default is 5, which may be far too low on small-RAM systems...
#
vm.execmin=20

# the maximum percentage of memory that will be reclaimed from other uses for
# text pages
#
# The default is 30, which may be too low, esp. for big programs on small-memory
# systems...
#
vm.execmax=40

# It may also be useful to set the bufmem high-water limit to a number which may
# actually be less than 5% (vm.bufcache / options BUFCACHE) on large-memory
# systems (as BUFCACHE cannot be set below 5%).
#
# note this value is given in bytes.
#
#vm.bufmem_hiwater=
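# As a sketch of producing such a byte value, here is a hypothetical
# calculation aiming at roughly 2% of an 8 GiB machine; on NetBSD the RAM
# size would come from `sysctl -n hw.physmem64` rather than a literal:
#
#   physmem=8589934592                  # stand-in for `sysctl -n hw.physmem64`
#   hiwater=$((physmem * 2 / 100))      # ~2% of RAM, in bytes
#   echo "vm.bufmem_hiwater=${hiwater}"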


--
Greg A. Woods 

Kelowna, BC +1 250 762-7675   RoboHack 
Planix, Inc.  Avoncote Farms 




Re: UVM behavior under memory pressure

2021-04-01 Thread Mouse
> on a system running netbsd-9 from mid-august, I ended up in a state
> where the system has little free memory, no free swap and almost 40%
> of RAM used by the file cache:

> Memory: 4987M Act, 2436M Inact, 123M Wired, 198M Exec, 2918M File, 4216K Free
> Swap: 520M Total, 520M Used, 4K Free

>   PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
>     0 root       0    0     0K   30M CPU/3     11:27  0.00%   122% [system]
>  1009 bouyer    43    0  2978M  275M parked/0   9:32 27.78% 27.78% firefox68
>  1519 bouyer    26    0  2583M  141M RUN/0      0:37 20.90% 20.90% firefox68
>  1243 bouyer    25    0  2912M  350M RUN/2      2:56 60.96% 20.12% firefox68

> Shouldn't UVM choose, in this case, to reclaim pages from the file
> cache for the process data ?

It strikes me as possible, at least, that it *is* doing that, only to
have more file data push them out again - essentially, that process
pages and file-cache pages are thrashing against one another.  Do you
have any particular reason to think it isn't?  I do note that your CPU
states (which I cut, above) show over 95% system time, which is one of
the things I would expect if that's what's happening.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML        mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B