--------
Konstantin Belousov writes:

> > B) We lack a nuanced call-back to tell the subsystems to release some of 
> > their memory "without major delay".

> The delay in the wall clock sense does not drive the issue.

I didn't say anything about "wall clock", and you're missing my point by a wide 
margin.

We need to make major memory consumers, like vnodes, take action *before* 
shortages happen, so that *when* they happen, a lot of memory can be released 
to relieve them.

> We cannot expect any io to proceed while we are low on memory [...]

Which is precisely why the top level goal should be for that to never happen, 
while still allowing the "freeable" memory to be used as a cache as much as 
possible.

> > C) We have never attempted to enlist userland, where jemalloc often hang on 
> > to a lot of unused VM pages.
> > 
> The userland does not add to this problem, [...]

No, but userland can help solve it:  The unused pages from jemalloc/userland 
can very quickly be released to relieve any imminent shortage the kernel might 
have.

As can pages from vnodes, and for that matter socket buffers.

But there are always costs: actual costs, i.e. what it will take to release the 
memory (locking, VM mappings, washing), and potential costs (lack of future 
caching opportunities).

These costs need to be presented to the central memory allocator, so when it 
decides back-pressure is appropriate, it can decide who to punk for how much 
memory.
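
To make the idea concrete, here is a rough sketch, not a proposal for an
actual kernel interface.  Every name in it (struct reclaim_offer,
plan_reclaim) is hypothetical, and collapsing "cost" into a single integer
is an assumption made only to keep the example short.  Each consumer
reports how much it could free and at what relative cost, and the
allocator asks the cheapest ones first:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what one memory consumer offers to give back. */
struct reclaim_offer {
	const char *name;	/* e.g. "vnodes", "sockbufs", "userland" */
	size_t	    freeable;	/* bytes it could release right now */
	unsigned    cost;	/* relative cost of releasing; lower is cheaper */
	size_t	    asked;	/* filled in by plan_reclaim() */
};

/*
 * Greedy plan: ask the cheapest consumers first until 'need' bytes are
 * covered.  Returns the number of bytes that could not be covered.
 */
static size_t
plan_reclaim(struct reclaim_offer *offers, size_t n, size_t need)
{
	size_t i;

	for (i = 0; i < n; i++)
		offers[i].asked = 0;

	while (need > 0) {
		struct reclaim_offer *best = NULL;
		size_t take;

		/* Find the cheapest consumer that still has memory to give. */
		for (i = 0; i < n; i++) {
			struct reclaim_offer *o = &offers[i];

			if (o->asked == o->freeable)
				continue;
			if (best == NULL || o->cost < best->cost)
				best = o;
		}
		if (best == NULL)
			break;		/* everybody is tapped out */

		take = best->freeable - best->asked;
		if (take > need)
			take = need;
		best->asked += take;
		assert(best->asked <= best->freeable);
		need -= take;
	}
	return (need);
}
```

A real version would have to split the cost into the actual and potential
parts mentioned above, and the offers would come from callbacks rather
than from a static table.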

> But normally operating system does not have an issue with user pages.  

Only if you disregard all non-UNIX operating systems.

Many other kernels have cooperated with userland to balance memory (and for 
that matter disk-space).

Just imagine how much better the desktop experience would be, if we could send 
SIGVM to firefox to tell it to stop being a memory-pig.

(At least two of the major operating systems in the desktop world do 
something like that today.)
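
For illustration only, this is roughly what the userland side could look
like.  SIGVM does not exist, so the sketch uses SIGUSR1 as a stand-in, and
the cache and all function names are made up.  The handler only sets a
flag; the program drops its cache from its normal main loop:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

#define PRESSURE_SIGNAL	SIGUSR1	/* stand-in: SIGVM is hypothetical */
#define CACHE_SLOTS	64

static volatile sig_atomic_t pressure_seen;
static void *cache[CACHE_SLOTS];	/* toy cache of 4k buffers */
static size_t cache_used;

/* Signal handler: only set a flag, do the real work elsewhere. */
static void
on_pressure(int signo)
{
	(void)signo;
	pressure_seen = 1;
}

static int
install_pressure_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_pressure;
	sigemptyset(&sa.sa_mask);
	return (sigaction(PRESSURE_SIGNAL, &sa, NULL));
}

/* Add up to n buffers to the cache; returns how many were added. */
static size_t
cache_fill(size_t n)
{
	size_t added = 0;

	while (added < n && cache_used < CACHE_SLOTS) {
		void *p = malloc(4096);

		if (p == NULL)
			break;
		cache[cache_used++] = p;
		added++;
	}
	return (added);
}

/* Called from the main loop: drop the cache if the kernel asked us to. */
static size_t
cache_trim_if_asked(void)
{
	size_t freed = 0;

	if (!pressure_seen)
		return (0);
	pressure_seen = 0;
	assert(cache_used <= CACHE_SLOTS);
	while (cache_used > 0) {
		free(cache[--cache_used]);
		freed++;
	}
	return (freed);
}
```

Whether free() actually hands the pages back to the kernel is up to the
allocator, so a real program would also have to ask its malloc
implementation to purge unused pages.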

> Io latency is not the factor there. We must avoid situations where
> instantiating a vnode stalls waiting for KVA to appear, similarly we
> must avoid system state where vnodes allocation consumed so much kmem
> that other allocations stall.

My argument is the precise opposite:  We must make vnodes and the allocations 
they cause responsive to the system's overall memory availability, well in 
advance of the shortage happening in the first place.
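
One way to picture "well in advance" is a pair of watermarks with a ramp
between them.  The sketch below is an assumption about how that could be
expressed, not a description of anything in the tree; the function name,
the two watermarks and the linear ramp are all invented for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical: how much memory to ask consumers to release, given the
 * current free page count.  Above soft_wm nothing is asked for, at or
 * below hard_wm the full max_request is asked for, and in between the
 * request grows linearly, so back-pressure starts before the shortage.
 */
static size_t
backpressure_target(size_t free_pages, size_t soft_wm, size_t hard_wm,
    size_t max_request)
{
	assert(hard_wm < soft_wm);

	if (free_pages >= soft_wm)
		return (0);
	if (free_pages <= hard_wm)
		return (max_request);

	/* 64-bit intermediate so the multiplication cannot overflow on ILP32. */
	return ((size_t)((uint64_t)max_request * (soft_wm - free_pages) /
	    (soft_wm - hard_wm)));
}
```

With soft_wm = 800, hard_wm = 200 and max_request = 600, a system with 500
free pages would be asked for 300, long before it hits the hard limit.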

> Quite indicative is that we do not shrink the vnode list on low memory
> events.  Vnlru also does not account for the memory pressure.

The only reason we do not is that we cannot tell definitively if freeing a 
vnode will cause disk-I/O (which may not matter with SSDs) or even how much 
memory it might free, if anything.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
p...@freebsd.org         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
