Hi, Thank you for all the help.
I added the code suggested by Paul and ran my small microbenchmark. The optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the current thread. All calls that aren't for the current thread will be a bit slower. As far as I am concerned, adding this optimization probably doesn't hurt. In fact it would be awesome if this could be backported to Java 8. But I am still strongly in favor of adding a special method just for the current thread: * It is still faster (35 ns vs 57 ns) * No heap memory is allocated by this method * If this method is available, callers can always expect good (constant time) performance. On the other hand two versions would exist for getThreadAllocatedBytes. Library code would have to figure out somehow if this particular JVM has the slow or fast version. In my own use case, I would only want to get the metric if it is extremely fast. It's not worth the overhead, if it takes thousands of nanoseconds. I wasn't aware of the existence of JavaThread::current() before. My new method in management.cpp could be simplified to this: JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env)) return JavaThread::current()->cooked_allocated_bytes(); JVM_END This code is not only simpler but also a bit faster. Each call now takes 35 ns instead of 37 ns. Is there a technical reason why no new native methods should be added to ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic number 0 to indicate the current thread. Wouldn't it be cleaner to have two methods instead of using and checking a special number in Java/native code? Best regards, Markus