Enabling a large number of watches for a large number of clients
----------------------------------------------------------------

                 Key: ZOOKEEPER-1177
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1177
             Project: ZooKeeper
          Issue Type: Improvement
          Components: server
    Affects Versions: 3.3.3
            Reporter: Vishal Kathuria
            Assignee: Vishal Kathuria
             Fix For: 3.5.0


In my ZooKeeper, I see watch manager consuming several GB of memory and I dug a 
bit deeper.

In the scenario I am testing, I have 10K clients connected to an observer. 
There are about 20K znodes in ZooKeeper, each is about 1K - so about 20M data 
in total.
Each client fetches and puts watches on all the znodes. That is 200 million 
watches.

It seems a single watch takes about 100  bytes. I am currently at 14528037 
watches and according to the yourkit profiler, WatchManager has 1.2 G already. 
This is not going to work as it might end up needing 20G of RAM just for the 
watches.

So we need a more compact way of storing watches. Here are the possible 
solutions.
1. Use a bitmap instead of the current hashmap. In this approach, each znode 
would get a unique id when its gets created. For every session, we can keep 
track of a bitmap that indicates the set of znodes this session is watching. A 
bitmap, assuming a 100K znodes, would be 12K. For 10K sessions, we can keep 
track of watches using 120M instead of 20G.
2. This second idea is based on the observation that clients watch znodes in 
sets (for example all znodes under a folder). Multiple clients watch the same 
set and the total number of sets is a couple of orders of magnitude smaller 
than the total number of znodes. In my scenario, there are about 100 sets. So 
instead of keeping track of watches at the znode level, keep track of it at the 
set level. It may mean that get may also need to be implemented at the set 
level. With this, we can save the watches in 100M.


Are there any other suggestions of solutions?

Thanks

 


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to