Re: Zookeeper does not clean up deleted nodes

2013-04-08 Thread Benjamin Reed
Are you looking at the leader or the follower? The leader keeps the last
few transactions in memory to speed up syncing with new followers. That
might be what you are seeing.
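
To make the effect concrete, here is a minimal sketch of a bounded in-memory transaction log. It is a toy model, not ZooKeeper's actual code: the class name `BoundedTxnLog`, the tuple layout, and the choice to count each create and each delete as one entry are all assumptions made for illustration. The limit of 500 is the figure discussed in this thread.

```python
from collections import deque


class BoundedTxnLog:
    """Toy model (assumption): keeps only the most recent `limit` entries."""

    def __init__(self, limit=500):
        # A deque with maxlen silently drops the oldest entry when full.
        self._log = deque(maxlen=limit)

    def append(self, op, path, data=b""):
        self._log.append((op, path, data))

    def __len__(self):
        return len(self._log)

    def retained_payload_bytes(self):
        # Payload bytes that are still reachable through the log.
        return sum(len(data) for _, _, data in self._log)


log = BoundedTxnLog(limit=500)
payload = bytes(1024)  # 1 KiB of dummy data per node

# Create each node and delete it right away, 1000 times.
for i in range(1000):
    path = f"/test/node-{i}"
    log.append("create", path, payload)
    log.append("delete", path)

retained = log.retained_payload_bytes()
print(len(log), "entries retained,", retained, "payload bytes still referenced")
```

Every node has been deleted, yet the payloads of the most recent create entries remain reachable through the log, so a full GC cannot reclaim them. How many payloads a real server retains depends on what it counts as a transaction and how it stores each one, which this toy does not try to reproduce.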


On Mon, Apr 8, 2013 at 3:32 AM, Mathias Hodler mathias.hod...@gmail.com wrote:

 Hi,

 I ran some tests, and it seems ZooKeeper doesn't clean up the last 500
 deleted nodes.

 In my test I created a node and deleted it right after it was created. I
 repeated this step 1000 times and then triggered a full GC. These are the
 results:

 Creating and deleting 1000 nodes, where each node has...
 ...1000kb data = 529MB heap used after FullGC
 ...500kb data = 281MB heap used after FullGC
 ...256kb data = 140MB heap used after FullGC
 ...128kb data =  68MB heap used after FullGC

 If I create 1000 nodes with 1000kb of data each and delete them, and then
 create 1000 nodes with 128kb of data each and delete those too, 68MB of heap
 space is used.

 So it seems ZooKeeper caches / doesn't clean up the last 500 deleted nodes.

 Is this a bug, or is there a configuration parameter to change that
 behaviour?
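
The "last 500" estimate can be checked against the four measurements with a little arithmetic: dividing the heap in use by the node size gives the number of payload-sized chunks that survived the full GC. The sketch below assumes 1 MB = 1024 kb (the message does not say which convention was used) and ignores the server's baseline heap usage, so the results are rough upper bounds rather than exact counts.

```python
KB_PER_MB = 1024  # assumption: the original message does not state the unit convention

# (node size in kb, heap used after full GC in MB), as reported above
measurements = [(1000, 529), (500, 281), (256, 140), (128, 68)]


def payload_equivalents(node_kb, heap_mb):
    """How many node-sized payloads would fit in the measured heap."""
    return heap_mb * KB_PER_MB / node_kb


for node_kb, heap_mb in measurements:
    count = payload_equivalents(node_kb, heap_mb)
    print(f"{node_kb:>5} kb per node: about {count:.0f} payloads worth of heap")
```

All four runs come out between roughly 540 and 580 payloads, which is consistent with about 500 payloads being retained plus some fixed overhead.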



Re: Zookeeper does not clean up deleted nodes

2013-04-08 Thread Henry Robinson
As Ben says, this is a feature, not a bug. However, the memory usage is
still excessive; see this jira:
https://issues.apache.org/jira/browse/ZOOKEEPER-1473

Henry


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


Re: Zookeeper does not clean up deleted nodes

2013-04-08 Thread Mathias Hodler
Thanks, this could be the reason. I only used a single ZooKeeper server, so
it should act as a leader.

So if I need to store larger files (about 1MB), the only option is to
increase the heap space? I know that ZooKeeper is designed for small files,
but I'm using ZooKeeper with Solr, and Solr stores all of the index
configuration, including large dictionaries, in ZooKeeper.




Re: Zookeeper does not clean up deleted nodes

2013-04-08 Thread Benjamin Reed
It would be very simple to make that 500 configurable. You should propose a
change.
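
As a rough illustration of what "configurable" could look like, the sketch below reads the limit from a setting and falls back to 500 when the setting is missing or invalid. The setting name `COMMIT_LOG_LIMIT` and the function `commit_log_limit` are hypothetical, chosen only for this example; they are not ZooKeeper options.

```python
import os

DEFAULT_COMMIT_LOG_LIMIT = 500  # the hard-coded value discussed in this thread


def commit_log_limit(env=None):
    """Return the configured limit, or the default if unset or invalid.

    `COMMIT_LOG_LIMIT` is a hypothetical setting name used for illustration.
    """
    env = os.environ if env is None else env
    raw = env.get("COMMIT_LOG_LIMIT")
    if raw is None:
        return DEFAULT_COMMIT_LOG_LIMIT
    try:
        value = int(raw)
    except ValueError:
        # Not a number: ignore the setting rather than fail at startup.
        return DEFAULT_COMMIT_LOG_LIMIT
    # A non-positive limit makes no sense for a "last N entries" log.
    return value if value > 0 else DEFAULT_COMMIT_LOG_LIMIT


print(commit_log_limit({"COMMIT_LOG_LIMIT": "50"}))
```

With roughly 1MB nodes, lowering the limit from 500 to 50 would cut the data pinned by the log by about a factor of ten, at the cost of fewer transactions being available for fast follower syncing.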





Re: Zookeeper does not clean up deleted nodes

2013-04-08 Thread Mathias Hodler
I created a new issue: https://issues.apache.org/jira/browse/ZOOKEEPER-1687

