cshannon commented on issue #4783:
URL: https://github.com/apache/accumulo/issues/4783#issuecomment-2293799324

   I started to investigate this more today because I wanted to see what 
size limit might be appropriate. I applied the sample patch here to generate 
the OOM heap dumps and noticed that even when I limited the cache size to 
something small, the test was still generating OOM errors, which was pretty 
weird. 
   
   I loaded the heap dumps into the Eclipse Memory Analyzer and discovered 
that the memory leak in this case had nothing to do with the cache inside of 
`VolumeManagerImpl`. There were a bunch of Configuration objects held in a weak 
reference hash map, which was surprising. Looking into it more, I found that 
the source of the memory leak in this case is that the hadoop Configuration 
object 
[registers](https://github.com/apache/hadoop/blob/f00094203bf40a8c3f2216cf22eaa5599e3b9b4d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L834)
 every new instance in a weak reference [hash 
map](https://github.com/apache/hadoop/blob/f00094203bf40a8c3f2216cf22eaa5599e3b9b4d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L325)
 inside its constructor. In other words, the OOM generated by the test 
modifications is not caused by the `VolumeManagerImpl` cache. 
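
   To make the pattern concrete, here is a minimal sketch of a class that 
registers every new instance in a static weak reference hash map from its 
constructor. This is not the hadoop code: `TrackedConfig`, `FakeContext` and 
`countWhileHeld` are hypothetical names made up for illustration, and the real 
Configuration class does much more. The sketch only shows that as long as each 
config is still strongly referenced somewhere (here, by its context), the 
registry keeps growing by one entry per context created.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class RegistrySketch {

  // Hypothetical stand-in for a config class that self-registers on construction.
  static class TrackedConfig {
    // Weak keys: an entry can only be dropped once nothing else strongly
    // references the key object.
    private static final Map<TrackedConfig, Object> REGISTRY = new WeakHashMap<>();

    TrackedConfig() {
      synchronized (TrackedConfig.class) {
        REGISTRY.put(this, null);
      }
    }

    static int registeredCount() {
      synchronized (TrackedConfig.class) {
        return REGISTRY.size();
      }
    }
  }

  // Hypothetical stand-in for a context that builds a fresh config every time.
  static class FakeContext {
    final TrackedConfig conf = new TrackedConfig();
  }

  // Creates n contexts, keeps them strongly reachable, and reports how many
  // configs the registry is tracking at that point.
  static int countWhileHeld(int n) {
    List<FakeContext> live = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      live.add(new FakeContext());
    }
    int count = TrackedConfig.registeredCount();
    // Touch `live` after counting so the contexts are still in use while we count.
    if (live.size() != n) {
      throw new IllegalStateException("unexpected number of contexts");
    }
    return count;
  }

  public static void main(String[] args) {
    System.out.println("registered configs: " + countWhileHeld(1000));
  }
}
```

   In a heap dump, a setup like this shows many config objects sitting in the 
registry's map, even though something else (the contexts) is what actually 
keeps them alive.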
   
   Thinking about the test changes here, this behavior makes sense because 
`TestAmple` doesn't use that cache. It's just 
[creating](https://github.com/apache/accumulo/blob/d8185cdea742b00c17b2877f6198fb2a8f73a7ef/test/src/main/java/org/apache/accumulo/test/ample/metadata/TestAmple.java#L243)
 a new ServerContext each time a new TestAmple is loaded, which in turn ends 
up creating a new config.
   
   @keith-turner - After finding this I was curious whether this was actually 
the memory leak all along. However, reading over the issue again, you said that 
you analyzed the heap dump and saw the objects were attached to the 
`VolumeManagerImpl` cache. If that is the case, then I'm assuming the way we 
are trying to reproduce this bug here is not correct. The OOM error being 
generated is similar (too many Configuration objects in memory) but not exactly 
the same: the large number of Configuration objects generated by TestAmple are 
not stored in the `VolumeManagerImpl` cache but are instead referenced by the 
weak reference hash map inside Configuration itself.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.