Hi Thomas,
On 17/06/2021 4:41 pm, Thomas Stuefe wrote:
The glibc is somewhat notorious for retaining released C Heap memory: calling
free(3) returns memory to the glibc, and most libc variants will return at
least a portion of it back to the Operating System, but the glibc often does
not.
This depends on the granularity of the allocations and a number of other
factors, but we found that many small allocations in particular may cause the
process heap segment (hence RSS) to bloat. As a result, the VM may not
recover from C-heap usage spikes.
The glibc offers an API, "malloc_trim", which can be used to cause the glibc to
return free'd memory back to the Operating System.
This may cost performance, however, and therefore I hesitate to call
malloc_trim automatically. That may be an idea for another day.
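
For readers who want to try the call outside the VM, here is a minimal sketch
in Python via ctypes. It is not taken from the patch: the helper name
trim_libc_heap is made up for illustration, and the sketch assumes the
process is linked against a C library that exports malloc_trim (glibc does).
It returns None when the symbol is missing.

```python
import ctypes


def trim_libc_heap(pad=0):
    """Ask the C library to return free heap memory to the OS.

    Hypothetical helper, not part of the patch. Returns the result of
    malloc_trim(3) (1 if memory was released, 0 if not), or None when the
    C library does not provide malloc_trim.
    """
    try:
        # CDLL(None) gives access to symbols already loaded in this process.
        libc = ctypes.CDLL(None)
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError, TypeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # pad is the amount of free space to leave untrimmed at the heap top.
    return malloc_trim(pad)


if __name__ == "__main__":
    print("malloc_trim result:", trim_libc_heap())
```

A return value of 0 only means nothing could be released at that moment. It
says nothing about how expensive the call was.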
Instead of an automatic trim I propose to add a jcmd which allows users to
manually trigger a libc heap trim. Such a command would have two purposes:
- when analyzing cases of high memory footprint, it allows distinguishing
"real" footprint, e.g. leaks, from cases where the glibc just holds on to
memory
- as a stopgap measure it allows releasing pressure from a high footprint
scenario.
Note that this command also helps with analyzing libc peaks which had nothing
to do with the VM - e.g. peaks created by customer code which just happens to
share the same process as the VM. Such memory does not even have to show up in
NMT.
I propose to introduce this command for Linux only. Other OSes (apart maybe
from AIX) do not seem to have this problem, but Linux is arguably important
enough in itself to justify a Linux-specific jcmd.
Is it perhaps worthwhile trying to generalize this to a jcmd to request
an attempt to release system resources and then each platform can do
whatever may be available to assist in that - including doing nothing,
or in this case trimming the glibc heap?
Thanks,
David
-----
If this finds agreement, I will file a CSR.
=========
This patch:
- introduces a new jcmd, "VM.trim_libc_heap", no arguments, which trims the
glibc heap on glibc platforms.
- includes a (rather basic) test
- the command calls malloc_trim(3), and additionally prints out its effect
(changes caused in virt size, rss and swap space)
- I refactored some code in os_linux.cpp to factor out scanning
/proc/self/status to get kernel memory information.
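
To illustrate what scanning /proc/self/status amounts to, here is a rough
sketch in Python. It is not the os_linux.cpp code: the function names are
made up, and picking the VmSize, VmRSS and VmSwap fields is my assumption
based on the values the command prints (virt size, rss, swap).

```python
def parse_status(text):
    """Extract VmSize, VmRSS and VmSwap (in kB) from /proc/<pid>/status text.

    Hypothetical helper for illustration. Fields that are missing from the
    input are simply left out of the result.
    """
    wanted = {"VmSize": "vmsize", "VmRSS": "vmrss", "VmSwap": "vmswap"}
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            fields = rest.split()
            # Lines look like "VmRSS:   8685896 kB"; take the number.
            if fields and fields[0].isdigit():
                info[wanted[key]] = int(fields[0])
    return info


def read_self_status(path="/proc/self/status"):
    """Read and parse the status file; returns {} if it cannot be read."""
    try:
        with open(path) as f:
            return parse_status(f.read())
    except OSError:
        return {}


if __name__ == "__main__":
    print(read_self_status())
```

Keeping the parsing separate from the file read makes it easy to take a
snapshot before and after the trim and compare the two.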
=========
Example:
A program causes a temporary peak in C-heap usage (in this case, triggered via
Unsafe.allocateMemory) and frees the memory again right away, so it is not
leaky. The peak in RSS was ~8G (even though the user allocation was way
smaller - glibc has a lot of overhead). The effects of this peak linger even
after returning that memory to the glibc:
thomas@starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
Resident Set Size: 8685896K (peak: 8685896K) (anon: 8648680K, file: 37216K,
shmem: 0K)
^^^^^^^^
We execute the new trim command via jcmd:
thomas@starfish:~$ jjjcmd AllocCHeap VM.trim_libc_heap
18770:
Attempting trim...
Done.
Virtual size before: 28849744k, after: 28849724k, (-20k)
RSS before: 8685896k, after: 920740k, (-7765156k) <<<<
Swap before: 0k, after: 0k, (0k)
It prints out the change in virtual size, RSS and swap. The virtual size barely
changed (-20k) since practically no mappings had been unmapped by the glibc.
However, the process heap was shrunk heavily by the glibc, resulting in a large
drop in RSS (8.5G->900M), freeing >7G of memory:
thomas@starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
Resident Set Size: 920740K (peak: 8686004K) (anon: 883460K, file: 37280K,
shmem: 0K)
^^^^^^^
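
The before/after lines are simple arithmetic on two snapshots. A small sketch
of that reporting step, with the numbers from the run above, might look like
this. The function name is hypothetical and the format is only modelled on
the sample output. In particular, how the real command prints an increase is
not shown in this mail, so the sketch just prints the plain signed
difference.

```python
def format_change(label, before_kb, after_kb):
    """Format one before/after line in the style of the sample output.

    Hypothetical helper: the delta is after minus before, so a reduction
    shows up as a negative number.
    """
    delta = after_kb - before_kb
    return f"{label} before: {before_kb}k, after: {after_kb}k, ({delta}k)"


if __name__ == "__main__":
    # Values copied from the example run in this mail.
    print(format_change("Virtual size", 28849744, 28849724))
    print(format_change("RSS", 8685896, 920740))
    print(format_change("Swap", 0, 0))
```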
When the VM is started with -Xlog:os, this is also logged:
[139,068s][info][os] malloc_trim:
[139,068s][info][os] Virtual size before: 28849744k, after: 28849724k, (-20k)
RSS before: 8685896k, after: 920740k, (-7765156k)
Swap before: 0k, after: 0k, (0k)
-------------
Commit messages:
- start
Changes: https://git.openjdk.java.net/jdk/pull/4510/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4510&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8268893
Stats: 237 lines in 6 files changed: 212 ins; 7 del; 18 mod
Patch: https://git.openjdk.java.net/jdk/pull/4510.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4510/head:pull/4510
PR: https://git.openjdk.java.net/jdk/pull/4510