We just began stress testing our application and found a memory leak.
Eventually, I isolated the leak to the NDC push/pop calls that we were
making in each thread that was created. Then, after looking more carefully
at the NDC class documentation, I saw the following regarding the remove()
method:
"Remove the diagnostic context for this thread.
Each thread that created a diagnostic context by calling push()
should call this method before exiting. Otherwise, the memory used by the
diagnostic context for the thread cannot be reclaimed by the VM.
As this is such an important problem in heavy duty systems and because
it is difficult to always guarantee that the remove method is called before
exiting a thread, this method has been augmented to lazily remove references
to dead threads. In practice, this means that you can be a little sloppy and
occasionally forget to call remove() before exiting a thread."
I had not known about the remove() method, probably because I initially
looked only at a few examples, such as Trivial.java, and remove() was not
used there. I also noticed that the documentation for the push and pop
methods does not mention remove() at all.
Furthermore, according to the remove() documentation, we should have been
OK anyway because of the lazy cleanup mechanism, but in our tests it did
not clean up enough, and we eventually ran out of memory.
So, if anyone else is currently doing push/pop in threads that come and go
frequently, be sure to call remove() as each thread exits... and perhaps
the remove() method could be mentioned in the documentation for push and
pop.
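
For anyone who wants to see the pattern, here is a rough sketch of what I
mean. It does not use log4j itself: NdcSketch is a made-up stand-in that
assumes the contexts are kept in a static table keyed by the Thread object,
which is my reading of the documentation, not a copy of the real code. The
names NdcSketch, runWorkers, trackedThreads and clear are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for NDC: one context stack per thread, held in a
// static table keyed by the Thread object. This is NOT log4j's actual code.
final class NdcSketch {
    private static final Map<Thread, Deque<String>> CONTEXTS = new ConcurrentHashMap<>();

    public static void push(String message) {
        CONTEXTS.computeIfAbsent(Thread.currentThread(), t -> new ArrayDeque<>()).push(message);
    }

    // Pops the top message; note that the thread's table entry stays behind.
    public static String pop() {
        Deque<String> stack = CONTEXTS.get(Thread.currentThread());
        return (stack == null || stack.isEmpty()) ? "" : stack.pop();
    }

    // Drops the calling thread's entry so its stack can be garbage collected.
    public static void remove() {
        CONTEXTS.remove(Thread.currentThread());
    }

    public static int trackedThreads() {
        return CONTEXTS.size();
    }

    public static void clear() {
        CONTEXTS.clear();
    }

    // Starts short-lived threads that push/pop, optionally calling remove().
    public static void runWorkers(int count, boolean callRemove) {
        Thread[] workers = new Thread[count];
        for (int i = 0; i < count; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                push("request-" + id);
                try {
                    // ... work that logs with this context goes here ...
                } finally {
                    pop();
                    if (callRemove) {
                        remove(); // the call we were missing
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    public static void main(String[] args) {
        runWorkers(100, false);
        System.out.println("entries left without remove(): " + trackedThreads());
        clear();
        runWorkers(100, true);
        System.out.println("entries left with remove(): " + trackedThreads());
    }
}
```

In this sketch, pop() alone leaves one table entry behind per finished
thread, while remove() in the finally block leaves none. With the real
class, the same shape would be NDC.push(...) at the top of run(), then
NDC.pop() and NDC.remove() in a finally block.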
Wes