Re: Zookeeper does not clean up deleted nodes
Are you looking at the leader or the follower? The leader keeps the last few transactions in memory to speed up syncing with new followers. That might be what you are seeing.

On Mon, Apr 8, 2013 at 3:32 AM, Mathias Hodler mathias.hod...@gmail.com wrote:

> Hi,
>
> I ran some tests, and it seems that ZooKeeper doesn't clean up the last 500 deleted nodes.
>
> In my test I created nodes and deleted each node right after it was created. I repeated this step 1000 times and then triggered a full GC.
>
> These are the results for creating and deleting 1000 nodes, where each node has...
>
> ...1000 KB data = 529 MB heap used after full GC
> ...500 KB data = 281 MB heap used after full GC
> ...256 KB data = 140 MB heap used after full GC
> ...128 KB data = 68 MB heap used after full GC
>
> If I create 1000 nodes with 1000 KB of data each, delete them, then create 1000 nodes with 128 KB of data each and delete those as well, 68 MB of heap space is used.
>
> So it seems ZooKeeper caches / doesn't clean up the last 500 deleted nodes.
>
> Is this a bug, or is there a configuration parameter to change this behaviour?
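The heap figures quoted above can be checked against the "last 500" hypothesis with a little arithmetic. The following is only a rough sketch: it assumes that roughly 500 payloads stay resident (the figure inferred in the original message) and ignores the server's baseline heap usage. The function names are made up for this example and are not part of ZooKeeper.

```python
# Measurements quoted in the thread:
# payload per node (KB) -> heap used after full GC (MB).
MEASURED_HEAP_MB = {1000: 529, 500: 281, 256: 140, 128: 68}

# Assumption: about 500 payloads remain in memory (figure from the thread).
RETAINED_PAYLOADS = 500


def estimated_heap_mb(payload_kb, retained=RETAINED_PAYLOADS):
    """Heap (MB) needed to hold `retained` payloads of `payload_kb` KB each."""
    return retained * payload_kb / 1024


def implied_retained_count(payload_kb, heap_mb):
    """Number of payloads of this size that would account for `heap_mb`."""
    return heap_mb * 1024 / payload_kb


for payload_kb, heap_mb in MEASURED_HEAP_MB.items():
    estimate = estimated_heap_mb(payload_kb)
    implied = implied_retained_count(payload_kb, heap_mb)
    print(f"{payload_kb:>5} KB: estimated {estimate:6.1f} MB, "
          f"measured {heap_mb} MB, implied count {implied:.0f}")
```

The implied counts come out somewhat above 500 for every payload size, which fits a fixed-size window of retained payloads plus some constant overhead. The arithmetic alone does not show what exactly is held per retained entry.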
Re: Zookeeper does not clean up deleted nodes
As Ben says, this is a feature, not a bug. However, the memory usage is still excessive; see this JIRA:

https://issues.apache.org/jira/browse/ZOOKEEPER-1473

Henry

On 8 April 2013 09:31, Benjamin Reed br...@apache.org wrote:

> Are you looking at the leader or the follower? The leader keeps the last few transactions in memory to speed up syncing with new followers. That might be what you are seeing.

--
Henry Robinson
Software Engineer
Cloudera
415-994-6679
Re: Zookeeper does not clean up deleted nodes
Thanks, this could be the reason. I only used a single ZooKeeper server, so it should be acting as the leader.

So if I need to store larger files (about 1 MB), is increasing the heap space the only option?

I know that ZooKeeper is designed for small files, but I'm using ZooKeeper with Solr, and Solr stores all of the index configuration, including large dictionaries, in ZooKeeper.

2013/4/8 Benjamin Reed br...@apache.org

> Are you looking at the leader or the follower? The leader keeps the last few transactions in memory to speed up syncing with new followers. That might be what you are seeing.
Re: Zookeeper does not clean up deleted nodes
It would be very simple to make that 500 configurable. You should propose a change.

On Mon, Apr 8, 2013 at 9:40 AM, Mathias Hodler mathias.hod...@gmail.com wrote:

> Thanks, this could be the reason. I only used a single ZooKeeper server, so it should be acting as the leader.
>
> So if I need to store larger files (about 1 MB), is increasing the heap space the only option?
>
> I know that ZooKeeper is designed for small files, but I'm using ZooKeeper with Solr, and Solr stores all of the index configuration, including large dictionaries, in ZooKeeper.
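To illustrate what "making that 500 configurable" could look like, here is a minimal sketch of a bounded in-memory log whose size comes from a setting with a default of 500. It is a toy model, not ZooKeeper code: the class `RecentTxnLog`, the function `read_limit`, and the setting name `EXAMPLE_COMMITTED_LOG_LIMIT` are all hypothetical and chosen only for this example.

```python
import os
from collections import deque

DEFAULT_LIMIT = 500  # the figure discussed in this thread


def read_limit(env=os.environ, name="EXAMPLE_COMMITTED_LOG_LIMIT"):
    """Read the retention limit from `env`, falling back to the default.

    The setting name is hypothetical and used only for this sketch.
    """
    raw = env.get(name)
    if raw is None:
        return DEFAULT_LIMIT
    value = int(raw)
    if value < 1:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


class RecentTxnLog:
    """Toy stand-in for a log that keeps only the most recent transactions."""

    def __init__(self, limit):
        # A deque with maxlen drops the oldest entry once the limit is reached.
        self._entries = deque(maxlen=limit)

    def append(self, txn_id, payload):
        self._entries.append((txn_id, payload))

    def __len__(self):
        return len(self._entries)

    def retained_bytes(self):
        return sum(len(payload) for _, payload in self._entries)


# Usage: with a limit of 50, only the last 50 payloads stay referenced.
log = RecentTxnLog(read_limit({"EXAMPLE_COMMITTED_LOG_LIMIT": "50"}))
for txn_id in range(1000):
    log.append(txn_id, bytes(1024))
print(len(log), log.retained_bytes())
```

In this model, a smaller limit reduces the memory pinned by recent large payloads. Going by the explanation earlier in the thread, the trade-off in a real ensemble would be fewer recent transactions available for syncing new followers quickly.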
Re: Zookeeper does not clean up deleted nodes
I created a new issue:

https://issues.apache.org/jira/browse/ZOOKEEPER-1687

2013/4/8 Benjamin Reed br...@apache.org

> It would be very simple to make that 500 configurable. You should propose a change.