For the past few months we’ve been trying to trace what looks like gradual 
memory creep. After some long-running experiments it appears to be native 
memory leaking somewhere under
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, 
JNI_ArgumentPusher*, Thread*).

My environment is Tomcat running a proxy webapp. It does TLS termination and 
authentication, then forwards calls to local services. It doesn’t do much 
else; it’s a relatively small application.

Some (possibly relevant) versions and config parameters:
Tomcat 8.5
Java 8u241 (Oracle)
Heap size = 360MB
MALLOC_ARENA_MAX=2
MALLOC_TRIM_THRESHOLD_=250048
jdk.nio.maxCachedBufferSize=25600
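For context, settings like these are typically applied through Tomcat's 
setenv.sh; a sketch with assumed file location (note the glibc variable is 
spelled MALLOC_ARENA_MAX, and MALLOC_TRIM_THRESHOLD_ really does end in an 
underscore):

```shell
# bin/setenv.sh (assumed location; adjust to your install)
export MALLOC_ARENA_MAX=2             # glibc: cap the number of malloc arenas
export MALLOC_TRIM_THRESHOLD_=250048  # glibc: trim free heap above this many bytes
CATALINA_OPTS="$CATALINA_OPTS -Xmx360m -Djdk.nio.maxCachedBufferSize=25600"
export CATALINA_OPTS
```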

We couldn’t find any evidence of memory leaking on the Java side.
When we turn on -XX:NativeMemoryTracking=detail and take a snapshot shortly 
after starting, we see (just one block shown):

[0x000003530e462f9a] JNIHandleBlock::allocate_block(Thread*)+0xaa
[0x000003530e3f759a] JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, 
JavaValue*, Thread*)+0x6a
[0x000003530e3fa000] JavaCalls::call_helper(JavaValue*, methodHandle*, 
JavaCallArguments*, Thread*)+0x8f0
[0x000003530e4454a1] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, 
JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.96] [clone 
.constprop.117]+0x1e1
                             (malloc=33783KB type=Internal #110876)
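For reference, a typical way to capture snapshots like the one above (a 
sketch; assumes jcmd is on the PATH, with <pid> being Tomcat's JVM process 
id — the exact commands we ran may have differed slightly):

```shell
# Start the JVM with native memory tracking at call-site granularity,
# e.g. by adding this to CATALINA_OPTS before starting Tomcat:
#   -XX:NativeMemoryTracking=detail
# Shortly after startup, record a baseline:
jcmd <pid> VM.native_memory baseline
# Later, dump per-call-site allocations and growth relative to that baseline:
jcmd <pid> VM.native_memory detail.diff
```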

Then we run it under heavy load for a few weeks and take another snapshot:

[0x000003530e462f9a] JNIHandleBlock::allocate_block(Thread*)+0xaa
[0x000003530e3f759a] JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, 
JavaValue*, Thread*)+0x6a
[0x000003530e3fa000] JavaCalls::call_helper(JavaValue*, methodHandle*, 
JavaCallArguments*, Thread*)+0x8f0
[0x000003530e4454a1] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, 
JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.96] [clone 
.constprop.117]+0x1e1
                             (malloc=726749KB type=Internal #2385226)

While other blocks also show some variation, none show growth like this one. 
Doing the math on those numbers, (726749KB - 33783KB) / (2385226 - 110876) 
comes out to a pretty even 312 bytes per allocation.
And we leaked just under 700MB. While not immediately problematic, this does 
not bode well for our customers, who run this service for months.
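That arithmetic can be checked directly (numbers copied from the two 
snapshots above; note the exact ratio is just a hair under 312):

```shell
delta_kb=$(( 726749 - 33783 ))   # 692966 KB of growth in this one block
delta_n=$(( 2385226 - 110876 ))  # 2274350 additional live allocations
# Integer division; the true ratio is ~311.99 bytes per allocation
echo "$(( delta_kb * 1024 / delta_n )) bytes per allocation"
```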

I’d like to avoid telling them they need to restart this service every two 
weeks to reclaim memory. Has anyone seen something like this? Any way it could 
be avoided?

    Mark Boon


