Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-04 Thread Poul-Henning Kamp

Konstantin Belousov writes:

> > B) We lack a nuanced call-back to tell the subsystems to release some of 
> > their memory "without major delay".

> The delay in the wall clock sense does not drive the issue.

I didn't say anything about "wall clock" and you're missing my point by a wide 
margin.

We need to make major memory consumers, like vnodes, take action *before* 
shortages happen, so that *when* they happen, a lot of memory can be released 
to relieve them.

> We cannot expect any io to proceed while we are low on memory [...]

Which is precisely why the top-level goal should be for that to never happen, 
while still allowing the "freeable" memory to be used as a cache as much as 
possible.

> > C) We have never attempted to enlist userland, where jemalloc often hang on 
> > to a lot of unused VM pages.
> > 
> The userland does not add to this problem, [...]

No, but userland can help solve it:  The unused pages from jemalloc/userland 
can very quickly be released to relieve any imminent shortage the kernel might 
have.

As can pages from vnodes, and for that matter socket buffers.

But there are always costs: actual costs, i.e. what it will take to release the 
memory (locking, VM mappings, washing), and potential costs (lack of future 
caching opportunities).

These costs need to be presented to the central memory allocator, so when it 
decides back-pressure is appropriate, it can decide who to punk for how much 
memory.
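
To make that concrete, a rough sketch of the kind of interface I have in mind 
(purely illustrative, none of these names exist in the tree today):

struct mem_reclaim_reg {
    /* How many bytes could be released "without major delay" right now? */
    size_t  (*mr_freeable)(void *arg);
    /* Release up to 'target' bytes; returns how much was actually freed. */
    size_t  (*mr_release)(void *arg, size_t target);
    /* Relative cost estimate: locking, VM mappings, washing, lost caching. */
    int     mr_cost;
    void    *mr_arg;
};

/* Vnodes, socket buffers, jemalloc-backed processes, ... register here. */
int     mem_reclaim_register(struct mem_reclaim_reg *reg);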

> But normally operating system does not have an issue with user pages.  

Only if you disregard all non-UNIX operating systems.

Many other kernels have cooperated with userland to balance memory (and for 
that matter disk-space).

Just imagine how much better the desktop experience would be, if we could send 
SIGVM to firefox to tell it to stop being a memory-pig.

(At least two of the major operating systems in the desktop world do 
something like that today.)
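
For illustration only -- SIGVM does not exist and the idle-buffer bookkeeping 
is made up -- a process could react to such a signal with little more than 
madvise(2):

#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical: a large cache the application knows it could do without. */
static void  *idle_buf;
static size_t idle_len;

static void
vm_pressure_handler(int sig)
{
    (void)sig;
    /* MADV_FREE lets the kernel reclaim these pages lazily, no swap I/O. */
    if (idle_buf != NULL)
        (void)madvise(idle_buf, idle_len, MADV_FREE);
}

int
register_vm_pressure_handler(int signo)     /* the imagined SIGVM */
{
    struct sigaction sa = { .sa_handler = vm_pressure_handler };

    sigemptyset(&sa.sa_mask);
    return (sigaction(signo, &sa, NULL));
}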

> Io latency is not the factor there. We must avoid situations where
> instantiating a vnode stalls waiting for KVA to appear, similarly we
> must avoid system state where vnodes allocation consumed so much kmem
> that other allocations stall.

My argument is the precise opposite:  We must make vnodes and the allocations 
they cause responsive to the system's overall memory availability, well in 
advance of the shortage happening in the first place.

> Quite indicative is that we do not shrink the vnode list on low memory
> events.  Vnlru also does not account for the memory pressure.

The only reason we do not is that we cannot tell definitively whether freeing a 
vnode will cause disk I/O (which may not matter with SSDs), or even how much 
memory it might free, if anything.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-04 Thread Konstantin Belousov
On Sun, Apr 04, 2021 at 07:01:44PM +, Poul-Henning Kamp wrote:
> 
> Konstantin Belousov writes:
> 
> > But what would you provide as the input for PID controller, and what would 
> > be the targets?
> 
> Viewing this purely as a vnode related issue is wrong, this is about memory 
> allocation in general.
> 
> We may or may not want a PID regulator, but putting it on counts of vnode 
> would not improve things, precisely, as you point out, because the amount of 
> memory a vnode ties up has enormous variance.
> 
Yes

> 
> We should focus on the end goal: To ensure "sufficient" memory can always be 
> allocated for any purpose "without major delay".
> 
and no

> 
> Architecturally there are three major problems:
> 
> A) While each subsystem generally have a good idea about memory that can be 
> released "without major delay", the information does not trickle up through a 
> summarizing NUMA aware tree.
> 
> B) We lack a nuanced call-back to tell the subsystems to release some of 
> their memory "without major delay".
The delay in the wall-clock sense does not drive the issue.
We cannot expect any I/O to proceed while we are low on memory, in the sense
that the allocators cannot respond right now.  More and more, our I/O subsystem
requires allocating memory to make any progress with I/O.  This is already
quite bad with geom, although some hacks keep it from standing out too much.

It is very bad with ZFS, where swap on zvols causes deadlocks almost
immediately.

> 
> C) We have never attempted to enlist userland, where jemalloc often hang on 
> to a lot of unused VM pages.
> 
Userland does not add to this problem, because the pagedaemon typically has
enough processing power to convert user-allocated pages into usable clean
or free pages.  Of course, if there is no swap and dirty anonymous pages
cannot be laundered, the problem accumulates.

But normally the operating system does not have an issue with user pages.

> 
> As far as vnodes go:
> 
> 
> It used to be that "without major delay" meant "without disk-I/O" which again 
> led to the "dirty buffers/VM pages" heuristic.
> 
> With microsecond SSD backing store, that heuristic is not only invalid, it is 
> down-right harmful in many cases.
> 
> GEOM maintains estimates of per-provider latency and VM+VFS should use that 
> to schedule write-back so that more of it happens outside rush-hour, in order 
> to increase the amount of memory which can be released "without major delay".
> 
> Today that happens largely as a side effect of the periodic syncer, which 
> does a really bad job at it, because it still expects VAX-era hardware 
> performance and workloads.
> 
I/O latency is not the factor there.  We must avoid situations where
instantiating a vnode stalls waiting for KVA to appear; similarly, we
must avoid a system state where vnode allocations have consumed so much kmem
that other allocations stall.

Quite indicative is that we do not shrink the vnode list on low-memory
events.  Vnlru also does not account for memory pressure.

The problem is that it is not clear how to express the relation between a
safe allocator state and our desire to cache filesystem data, which is
bound to the vnode identity.


Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-04 Thread Poul-Henning Kamp

Konstantin Belousov writes:

> But what would you provide as the input for PID controller, and what would be 
> the targets?

Viewing this purely as a vnode-related issue is wrong; this is about memory 
allocation in general.

We may or may not want a PID regulator, but putting it on vnode counts would 
not improve things, precisely because, as you point out, the amount of memory a 
vnode ties up has enormous variance.


We should focus on the end goal: To ensure "sufficient" memory can always be 
allocated for any purpose "without major delay".


Architecturally there are three major problems:

A) While each subsystem generally has a good idea about memory that can be 
released "without major delay", the information does not trickle up through a 
summarizing NUMA-aware tree.

B) We lack a nuanced call-back to tell the subsystems to release some of their 
memory "without major delay".

C) We have never attempted to enlist userland, where jemalloc often hangs on to 
a lot of unused VM pages.


As far as vnodes go:


It used to be that "without major delay" meant "without disk-I/O" which again 
led to the "dirty buffers/VM pages" heuristic.

With microsecond SSD backing store, that heuristic is not only invalid, it is 
downright harmful in many cases.

GEOM maintains estimates of per-provider latency, and VM+VFS should use those to 
schedule write-back so that more of it happens outside rush-hour, in order to 
increase the amount of memory which can be released "without major delay".

Today that happens largely as a side effect of the periodic syncer, which does 
a really bad job at it, because it still expects VAX-era hardware performance 
and workloads.
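
Purely as a sketch of the policy -- geom_provider_latency_us() is an invented 
helper and the thresholds are placeholders, not tuning advice:

#include <sys/param.h>
#include <geom/geom.h>

/* Invented: recent average write latency of the provider, in microseconds. */
uint64_t geom_provider_latency_us(struct g_provider *pp);

static int
should_writeback_now(struct g_provider *pp, int dirty_pct)
{
    uint64_t lat_us = geom_provider_latency_us(pp);

    if (lat_us < 500)           /* fast, idle SSD: clean pages outside rush-hour */
        return (dirty_pct > 1);
    return (dirty_pct > 10);    /* slow or busy provider: traditional threshold */
}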


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-04 Thread Konstantin Belousov
On Sun, Apr 04, 2021 at 08:45:41AM -0600, Warner Losh wrote:
> On Sun, Apr 4, 2021, 5:51 AM Mateusz Guzik  wrote:
> 
> > On 4/3/21, Poul-Henning Kamp  wrote:
> > > 
> > > Mateusz Guzik writes:
> > >
> > >> It is high because of this:
> > >> msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk",
> > >> hz);
> > >>
> > >> i.e. it literally sleeps for 1 second.
> > >
> > > Before the line looked like that, it slept on "lbolt" aka "lightning
> > > bolt" which was woken once a second.
> > >
> > > The calculations which come up with those "constants" have always
> > > been utterly bogus math, not quite "square-root of shoe-size
> > > times sun-angle in Patagonia", but close.
> > >
> > > The original heuristic came from university environments with tons of
> > > students doing assignments and nethack behind VT102 terminals, on
> > > filesystems where files only seldom grew past 100KB, so it made sense
> > > to scale number of vnodes to how much RAM was in the system, because
> > > that also scaled the size of the buffer-cache.
> > >
> > > With a merged VM buffer-cache, whatever validity that heuristic had
> > > was lost, and we tweaked the bogomath in various ways until it
> > > seemed to mostly work, trusting the users for which it did not, to
> > > tweak things themselves.
> > >
> > > Please dont tweak the Finagle Constants again.
> > >
> > > Rip all that crap out and come up with something fundamentally better.
> > >
> >
> > Some level of pacing is probably useful to control total memory use --
> > there can be A LOT of memory tied up in mere fact that vnode is fully
> > cached. imo the thing to do is to come up with some watermarks to be
> > revisited every 1-2 years and to change the behavior when they get
> > exceeded -- try to whack some stuff but in face of trouble just go
> > ahead and alloc without sleep 1. Should the load spike sort itself
> > out, vnlru will slowly get things down to the watermark. If the
> > watermark is too low, maybe it can autotune. Bottom line is that even
> > with the current idea of limiting preferred total vnode count, the
> > corner case behavior can be drastically better suffering SOME perf
> > loss from recycling vnodes, but not sleeping for a second for every
> > single one.
> >
> 
> I'd suggest that going directly to a PID to control this would be better
> than the watermarks. That would give a smoother response than high/low
> watermarks would. While you'd need some level to keep things at still, the
> laundry stuff has shown the precise level of that level is less critical
> than the watermarks.
But what would you provide as the input for the PID controller, and what
would be the targets?

The main reason for the (almost) hard cap on the number of vnodes is not
that an excessive number of vnodes is harmful by itself.  Each allocated
vnode typically implies the existence of several second-order allocations
that accumulate into significant KVA usage:
- filesystem inode
- vm object
- namecache entries
There are usually even more, third-order, allocations; for instance a UFS
inode carries a pointer to the dinode copy in RAM, and possibly an EA area.
And of course a vnode names pages in the page cache owned by the
corresponding file, i.e. the number of allocated vnodes regulates the amount
of work for the pagedaemon.
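
As a back-of-the-envelope illustration only (the per-object sizes below are
rough guesses, not measurements):

#include <sys/param.h>
#include <sys/vnode.h>

/* Rough, illustrative accounting of what one cached vnode drags in. */
static size_t
vnode_footprint_guess(void)
{
    size_t sz = sizeof(struct vnode);   /* the vnode itself */

    sz += 1024;         /* filesystem inode (second-order) */
    sz += 256;          /* VM object (second-order) */
    sz += 2 * 128;      /* a couple of namecache entries */
    sz += 512;          /* third-order: dinode copy, EA area, ... */
    /* ...and this still ignores the file's pages in the page cache. */
    return (sz);
}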

We are currently trying to put some rational limit on the total number of
vnodes, estimating both the KVA and the physical memory consumed by them.  If
you remove that limit, you need to ensure that we do not create an OOM
situation, either for KVA or for physical memory, just by creating too many
vnodes; otherwise the system cannot get out of it.

So there are some combinations of machine config (RAM) and load where the
default settings are arguably too low.  Raising the limits needs to account
for the indirect resource usage of each vnode.

I do not know how to write a feedback formula that takes into account all
the consequences of a vnode's existence, and those effects also depend on the
underlying filesystem and on the patterns of VM paging usage.  In this sense
ZFS is probably the simplest case, because its caching subsystem is
autonomous, while UFS and NFS are tightly integrated with the VM.

> 
> Warner
> 
> > I think the notion of 'struct vnode' being a separately allocated
> > object is not very useful and it comes with complexity (and happens to
> > suffer from several bugs).
> >
> > That said, the easiest and safest thing to do in the meantime is to
> > bump the limit. Perhaps the sleep can be whacked as it is which would
> > largely sort it out.
> >
> > --
> > Mateusz Guzik 

Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-04 Thread Warner Losh
On Sun, Apr 4, 2021, 5:51 AM Mateusz Guzik  wrote:

> On 4/3/21, Poul-Henning Kamp  wrote:
> > 
> > Mateusz Guzik writes:
> >
> >> It is high because of this:
> >> msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk",
> >> hz);
> >>
> >> i.e. it literally sleeps for 1 second.
> >
> > Before the line looked like that, it slept on "lbolt" aka "lightning
> > bolt" which was woken once a second.
> >
> > The calculations which come up with those "constants" have always
> > been utterly bogus math, not quite "square-root of shoe-size
> > times sun-angle in Patagonia", but close.
> >
> > The original heuristic came from university environments with tons of
> > students doing assignments and nethack behind VT102 terminals, on
> > filesystems where files only seldom grew past 100KB, so it made sense
> > to scale number of vnodes to how much RAM was in the system, because
> > that also scaled the size of the buffer-cache.
> >
> > With a merged VM buffer-cache, whatever validity that heuristic had
> > was lost, and we tweaked the bogomath in various ways until it
> > seemed to mostly work, trusting the users for which it did not, to
> > tweak things themselves.
> >
> > Please dont tweak the Finagle Constants again.
> >
> > Rip all that crap out and come up with something fundamentally better.
> >
>
> Some level of pacing is probably useful to control total memory use --
> there can be A LOT of memory tied up in mere fact that vnode is fully
> cached. imo the thing to do is to come up with some watermarks to be
> revisited every 1-2 years and to change the behavior when they get
> exceeded -- try to whack some stuff but in face of trouble just go
> ahead and alloc without sleep 1. Should the load spike sort itself
> out, vnlru will slowly get things down to the watermark. If the
> watermark is too low, maybe it can autotune. Bottom line is that even
> with the current idea of limiting preferred total vnode count, the
> corner case behavior can be drastically better suffering SOME perf
> loss from recycling vnodes, but not sleeping for a second for every
> single one.
>

I'd suggest that going directly to a PID controller would be better than the
watermarks. That would give a smoother response than high/low watermarks
would. While you'd still need some level to keep things at, the laundry code
has shown that the precise value of that level is less critical than with
watermarks.
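
Something like the usual textbook form, just as a sketch -- what to measure
and what setpoint to hold are exactly the open questions, and the gains here
are placeholders:

struct pidctl {
    double  kp, ki, kd;     /* gains */
    double  integ;          /* accumulated error */
    double  prev_err;       /* previous error, for the derivative term */
};

static double
pidctl_step(struct pidctl *p, double setpoint, double measured, double dt)
{
    double err = setpoint - measured;
    double deriv = (err - p->prev_err) / dt;

    p->integ += err * dt;
    p->prev_err = err;
    /* The output could drive how aggressively vnlru recycles vnodes. */
    return (p->kp * err + p->ki * p->integ + p->kd * deriv);
}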

Warner

> I think the notion of 'struct vnode' being a separately allocated
> object is not very useful and it comes with complexity (and happens to
> suffer from several bugs).
>
> That said, the easiest and safest thing to do in the meantime is to
> bump the limit. Perhaps the sleep can be whacked as it is which would
> largely sort it out.
>
> --
> Mateusz Guzik 


Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-04 Thread Mateusz Guzik
On 4/3/21, Poul-Henning Kamp  wrote:
> 
> Mateusz Guzik writes:
>
>> It is high because of this:
>> msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk",
>> hz);
>>
>> i.e. it literally sleeps for 1 second.
>
> Before the line looked like that, it slept on "lbolt" aka "lightning
> bolt" which was woken once a second.
>
> The calculations which come up with those "constants" have always
> been utterly bogus math, not quite "square-root of shoe-size
> times sun-angle in Patagonia", but close.
>
> The original heuristic came from university environments with tons of
> students doing assignments and nethack behind VT102 terminals, on
> filesystems where files only seldom grew past 100KB, so it made sense
> to scale number of vnodes to how much RAM was in the system, because
> that also scaled the size of the buffer-cache.
>
> With a merged VM buffer-cache, whatever validity that heuristic had
> was lost, and we tweaked the bogomath in various ways until it
> seemed to mostly work, trusting the users for which it did not, to
> tweak things themselves.
>
> Please dont tweak the Finagle Constants again.
>
> Rip all that crap out and come up with something fundamentally better.
>

Some level of pacing is probably useful to control total memory use --
there can be A LOT of memory tied up in the mere fact that a vnode is fully
cached. IMO the thing to do is to come up with some watermarks to be
revisited every 1-2 years and to change the behavior when they get
exceeded -- try to whack some stuff, but in the face of trouble just go
ahead and allocate without the 1-second sleep. Should the load spike sort
itself out, vnlru will slowly get things down to the watermark. If the
watermark is too low, maybe it can autotune. The bottom line is that even
with the current idea of limiting the preferred total vnode count, the
corner-case behavior can be drastically better: SOME perf loss from
recycling vnodes, but no one-second sleep for every single one.
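
Roughly what I mean, as pseudo-C -- vnode_count(), vnlru_kick() and
vnode_high_watermark are made-up names, only desiredvnodes and the msleep
channel below are real:

static int
vn_alloc_pace(void)
{
    if (vnode_count() <= desiredvnodes)
        return (0);     /* under the soft limit: just allocate */

    vnlru_kick();       /* ask vnlru to start reclaiming now */

    if (vnode_count() < vnode_high_watermark)
        return (0);     /* above preference: allocate, no sleep */

    /* Only at the hard watermark do we wait, and then only briefly. */
    return (msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk", hz / 10));
}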

I think the notion of 'struct vnode' being a separately allocated
object is not very useful and it comes with complexity (and happens to
suffer from several bugs).

That said, the easiest and safest thing to do in the meantime is to
bump the limit. Perhaps the sleep can be whacked as it is, which would
largely sort it out.

-- 
Mateusz Guzik 


Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-03 Thread Poul-Henning Kamp

Mateusz Guzik writes:

> It is high because of this:
> msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk", hz);
>
> i.e. it literally sleeps for 1 second.

Before the line looked like that, it slept on "lbolt" aka "lightning
bolt" which was woken once a second.

The calculations which come up with those "constants" have always
been utterly bogus math, not quite "square-root of shoe-size
times sun-angle in Patagonia", but close.

The original heuristic came from university environments with tons of
students doing assignments and nethack behind VT102 terminals, on
filesystems where files only seldom grew past 100KB, so it made sense
to scale number of vnodes to how much RAM was in the system, because
that also scaled the size of the buffer-cache.

With a merged VM buffer-cache, whatever validity that heuristic had
was lost, and we tweaked the bogomath in various ways until it
seemed to mostly work, trusting the users for whom it did not to
tweak things themselves.

Please don't tweak the Finagle Constants again.

Rip all that crap out and come up with something fundamentally better.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: [SOLVED] Re: Strange behavior after running under high load

2021-04-02 Thread Mateusz Guzik
On 4/2/21, Stefan Esser  wrote:
> Am 28.03.21 um 16:39 schrieb Stefan Esser:
>> After a period of high load, my now idle system needs 4 to 10 seconds to
>> run any trivial command - even after 20 minutes of no load ...
>>
>>
>> I have run some Monte-Carlo simulations for a few hours, with initially
> 35
>> processes running in parallel for some 10 seconds each.
>>
>> The load decreased over time since some parameter sets were faster to
>> process.
>> All in all 63000 processes ran within some 3 hours.
>>
>> When the system became idle, interactive performance was very bad.
>> Running
>> any trivial command (e.g. uptime) takes some 5 to 10 seconds. Since I
>> have
>> to have this system working, I plan to reboot it later today, but will
>> keep
>> it in this state for some more time to see whether this state persists or
>> whether the system recovers from it.
>>
>> Any ideas what might cause such a system state???
>
> Seems that Mateusz Guzik was right to mention performance issues when
> the system is very low on vnodes. (Thanks!)
>
> I have been able to reproduce the issue and have checked vnode stats:
>
> kern.maxvnodes: 620370
> kern.minvnodes: 155092
> vm.stats.vm.v_vnodepgsout: 6890171
> vm.stats.vm.v_vnodepgsin: 18475530
> vm.stats.vm.v_vnodeout: 228516
> vm.stats.vm.v_vnodein: 1592444
> vfs.wantfreevnodes: 155092
> vfs.freevnodes: 47<- obviously too low ...
> vfs.vnodes_created: 19554702
> vfs.numvnodes: 621284
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 6412
>
> The freevnodes value stayed in this region over several minutes, with
> typical program start times (e.g. for "uptime") in the region of 10 to
> 15 seconds.
>
> After rising maxvnodes to 2,000,000 form 600,000 the system performance
> is restored and I get:
>
> kern.maxvnodes: 2000000
> kern.minvnodes: 500000
> vm.stats.vm.v_vnodepgsout: 7875198
> vm.stats.vm.v_vnodepgsin: 20788679
> vm.stats.vm.v_vnodeout: 261179
> vm.stats.vm.v_vnodein: 1817599
> vfs.wantfreevnodes: 500000
> vfs.freevnodes: 205988  <- still a lot higher than wantfreevnodes
> vfs.vnodes_created: 19956502
> vfs.numvnodes: 912880
> vfs.cache.debug.vnodes_cel_3_failures: 0
> vfs.cache.stats.heldvnodes: 20702
>
> I do not know why the performance impact is so high - there are a few
> free vnodes (more than required for the shared libraries to start e.g.
> the uptime program). Most probably each attempt to get a vnode triggers
> a clean-up attempt that runs for a significant time, but has no chance
> to actually reach near the goal of 155k or 500k free vnodes.
>

It is high because of this:
msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk", hz);

i.e. it literally sleeps for 1 second.

The vnode limit is probably too conservative, and the behavior when the limit
is reached is rather broken. Probably the thing to do is to let
allocations go through while kicking vnlru to free some stuff up. I'll
have to sleep on it.


> Anyway, kern.maxvnodes can be changed at run-time and it is thus easy
> to fix. It seems that no message is logged to report this situation.
> A rate limited hint to rise the limit should help other affected users.
>
> Regards, STefan
>
>


-- 
Mateusz Guzik 


[SOLVED] Re: Strange behavior after running under high load

2021-04-02 Thread Stefan Esser

On 28.03.21 at 16:39, Stefan Esser wrote:

> After a period of high load, my now idle system needs 4 to 10 seconds to
> run any trivial command - even after 20 minutes of no load ...
> 
> I have run some Monte-Carlo simulations for a few hours, with initially 35
> processes running in parallel for some 10 seconds each.
> 
> The load decreased over time since some parameter sets were faster to process.
> All in all 63000 processes ran within some 3 hours.
> 
> When the system became idle, interactive performance was very bad. Running
> any trivial command (e.g. uptime) takes some 5 to 10 seconds. Since I have
> to have this system working, I plan to reboot it later today, but will keep
> it in this state for some more time to see whether this state persists or
> whether the system recovers from it.
> 
> Any ideas what might cause such a system state???


Seems that Mateusz Guzik was right to mention performance issues when
the system is very low on vnodes. (Thanks!)

I have been able to reproduce the issue and have checked vnode stats:

kern.maxvnodes: 620370
kern.minvnodes: 155092
vm.stats.vm.v_vnodepgsout: 6890171
vm.stats.vm.v_vnodepgsin: 18475530
vm.stats.vm.v_vnodeout: 228516
vm.stats.vm.v_vnodein: 1592444
vfs.wantfreevnodes: 155092
vfs.freevnodes: 47  <- obviously too low ...
vfs.vnodes_created: 19554702
vfs.numvnodes: 621284
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 6412

The freevnodes value stayed in this region over several minutes, with
typical program start times (e.g. for "uptime") in the region of 10 to
15 seconds.

After raising maxvnodes to 2,000,000 from 600,000, the system performance
is restored and I get:

kern.maxvnodes: 2000000
kern.minvnodes: 500000
vm.stats.vm.v_vnodepgsout: 7875198
vm.stats.vm.v_vnodepgsin: 20788679
vm.stats.vm.v_vnodeout: 261179
vm.stats.vm.v_vnodein: 1817599
vfs.wantfreevnodes: 500000
vfs.freevnodes: 205988  <- still a lot higher than wantfreevnodes
vfs.vnodes_created: 19956502
vfs.numvnodes: 912880
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 20702

I do not know why the performance impact is so high - there are a few
free vnodes (more than required for the shared libraries needed to start e.g.
the uptime program). Most probably each attempt to get a vnode triggers
a clean-up attempt that runs for a significant time but has no chance
of actually getting near the goal of 155k or 500k free vnodes.

Anyway, kern.maxvnodes can be changed at run time and it is thus easy
to fix. It seems that no message is logged to report this situation.
A rate-limited hint to raise the limit should help other affected users.
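
Something along these lines would do (a sketch using the existing
ppsratecheck(9) helper; the message text and the call site are my invention):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/time.h>

static struct timeval vn_warn_last;
static int vn_warn_pps;

/* Called from the path that currently just sleeps waiting for a free vnode. */
static void
vn_alloc_maybe_warn(void)
{
    /* At most one hint per second. */
    if (ppsratecheck(&vn_warn_last, &vn_warn_pps, 1))
        printf("vnode table is full; consider raising kern.maxvnodes\n");
}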

Regards, STefan


