For the past few months we’ve been trying to trace what looks like gradual memory creep. After some long-running experiments it appears to be native memory leaking somewhere on the path where jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) is invoked.
My environment is Tomcat running a proxy webapp. It does TLS termination and authentication, then forwards the call to local services. It doesn’t do much else; it’s a relatively small application.

Some (possibly relevant) versions and config parameters:

    Tomcat 8.5
    Java 8u241 (Oracle)
    Heap size = 360 MB
    MAX_ALLOC_ARENA=2
    MALLOC_TRIM_THRESHOLD_=250048
    jdk.nio.maxCachedBufferSize=25600

We couldn’t find any evidence of memory leaking on the Java side. When we turn on NativeMemoryTracking=detail and take a snapshot shortly after starting, we see (just one block shown):

    [0x000003530e462f9a] JNIHandleBlock::allocate_block(Thread*)+0xaa
    [0x000003530e3f759a] JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, JavaValue*, Thread*)+0x6a
    [0x000003530e3fa000] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x8f0
    [0x000003530e4454a1] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.96] [clone .constprop.117]+0x1e1
                         (malloc=33783KB type=Internal #110876)

Then we run it under heavy load for a few weeks and take another snapshot:

    [0x000003530e462f9a] JNIHandleBlock::allocate_block(Thread*)+0xaa
    [0x000003530e3f759a] JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, JavaValue*, Thread*)+0x6a
    [0x000003530e3fa000] JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x8f0
    [0x000003530e4454a1] jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) [clone .isra.96] [clone .constprop.117]+0x1e1
                         (malloc=726749KB type=Internal #2385226)

While other blocks also show some variation, none show growth like this one. When I do the math on the numbers, (726749 KB - 33783 KB) / (2385226 - 110876) comes down to a pretty even 312 bytes per allocation, and we leaked just under 700 MB. While not immediately problematic, this does not bode well for our customers, who run this service for months.
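For anyone wanting to reproduce this, the snapshots can be gathered with NMT’s standard baseline/diff support in jcmd (which ships with the JDK). This is just a sketch of the procedure; the PID is a placeholder:

```shell
# Start the JVM with detailed native memory tracking enabled
# (adds some overhead), e.g. via CATALINA_OPTS:
#   -XX:NativeMemoryTracking=detail

# Shortly after startup, record a baseline:
jcmd <pid> VM.native_memory baseline

# After running under load for a while, diff against the baseline,
# with per-call-site stacks like the ones above:
jcmd <pid> VM.native_memory detail.diff
```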
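As a quick sanity check on that arithmetic (the numbers are taken straight from the two NMT snapshots above):

```python
# Per-allocation growth between the two NMT snapshots.
start_kb, start_count = 33783, 110876      # shortly after startup
end_kb, end_count = 726749, 2385226        # after weeks under load

leaked_bytes = (end_kb - start_kb) * 1024
allocations = end_count - start_count

print(round(leaked_bytes / allocations))   # -> 312 bytes per allocation
print(leaked_bytes // (1024 * 1024))       # -> 676 MB leaked in total
```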
I’d like to avoid telling them they need to restart this service every two weeks to reclaim memory. Has anyone seen something like this? Any way it could be avoided?

Mark Boon