We tested up to the ulimit (~16K connections) against a single server and 
performance was OK, but I would definitely do some serious load testing 
before putting a system into production that I knew was going to carry that 
load from the get-go.
The system degrades VERY ungracefully when you hit the ulimit for the process, 
so be sure to have enough ensemble nodes to spread those connections across 
so that this won't happen. I think there may be a JIRA open to deal with this 
issue; I'm not sure what its status is.
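
A rough way to size this, as a sketch only: the function name, the 50% headroom 
figure and the single spare node are my own assumptions, not measured values. 
The idea is to keep each server well below its descriptor limit even after one 
node is lost and its clients reconnect elsewhere.

```python
import math


def servers_needed(total_clients, per_server_limit, headroom=0.5, failures=1):
    """Estimate how many client-serving nodes keep each one under its limit.

    headroom is the fraction of the per-process limit we are willing to use;
    failures is the number of nodes we want to be able to lose. Both are
    illustrative assumptions, not ZooKeeper settings.
    """
    usable = int(per_server_limit * headroom)  # connections one server may carry
    if usable <= 0:
        raise ValueError("headroom leaves no usable connections per server")
    # Nodes needed to carry the load, plus spares so the survivors still fit
    # when `failures` nodes go away.
    return math.ceil(total_clients / usable) + failures


if __name__ == "__main__":
    for clients in (10_000, 100_000):
        print(clients, servers_needed(clients, per_server_limit=16_000))
```

This only counts connections. It says nothing about GC pressure or watch 
fan-out, so it is no substitute for the load test.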

C

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, November 18, 2010 2:57 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: number of clients/watchers

FYI: I haven't heard of anyone running over 10k sessions. I've tried
20k before and had issues, so you may want to look at this sooner rather
than later.

* Server GC tuning will be an issue (be sure to use CMS/incremental).
* Be sure to stop clients from connecting to the leader (a server configuration param).
* You may need to use the Observers feature to scale out this far.
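
For reference, a sketch of how those three points might look in configuration. 
The hostnames, server ids and heap size are placeholders I made up. The CMS 
flags apply to HotSpot JVMs of this era, so check them against the JVM you 
actually run.

```
# zoo.cfg on every server, voters and observers alike
leaderServes=no
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888:observer

# additionally, in zoo.cfg on the observer (server.4) only
peerType=observer

# JVM flags for each server process (heap size is a guess; tune under load)
# -Xmx4g -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
```

Clients would then list only the followers and observers in their connect 
string.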

Patrick

On Thu, Nov 18, 2010 at 10:31 AM, Jeremy Hanna
<jeremy.hanna1...@gmail.com> wrote:
>>> Can you clarify what you mean when you say 10-100K watchers? Do you mean 
>>> 10-100K clients with 1 active watch, or some lesser number of clients with 
>>> more watches, or a few clients doing a lot of watches and other clients 
>>> doing other things?
>
> Probably 10-100K clients each with 1 or 2 active watches.  The clients will 
> respond to watch events and sometimes initiate actions of their own.
>
>> here's a similar test setup I used:
>
> Thanks Patrick - it's really nice to have those numbers and test harness 
> basis.
>
> We're still in architecture mode so some of the details are still in flux, 
> but I think this gives us an idea.
>
> Thanks very much.
>
> On Nov 18, 2010, at 11:51 AM, Patrick Hunt wrote:
>
>> Camille, that's a very good question. Largest cluster I've heard about
>> is 10k sessions.
>>
>> Jeremy - largest I've ever tested was a 3 server cluster with ~500
>> sessions. Each session created 10k znodes (100 bytes each znode) and
>> set 5 watches on each. So 5 million znodes and 25 million watches. I
>> then had the sessions delete the znodes and looked for the
>> notifications. They were processed by the clients quite quickly (order
>> of seconds) iirc. Note: this required some GC tuning on the servers to
>> operate correctly (in particular CMS and incremental GC were turned on
>> and sufficient memory was allocated for the heaps).
>>
>> here's a similar test setup I used:
>> http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
>> this is the latency tester tool
>> https://github.com/phunt/zk-smoketest
>>
>> Patrick
>>
>> On Thu, Nov 18, 2010 at 9:44 AM, Fournier, Camille F. [Tech]
>> <camille.fourn...@gs.com> wrote:
>>> Can you clarify what you mean when you say 10-100K watchers? Do you mean 
>>> 10-100K clients with 1 active watch, or some lesser number of clients with 
>>> more watches, or a few clients doing a lot of watches and other clients 
>>> doing other things?
>>>
>>> -----Original Message-----
>>> From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com]
>>> Sent: Thursday, November 18, 2010 12:15 PM
>>> To: zookeeper-user@hadoop.apache.org
>>> Subject: number of clients/watchers
>>>
>>> I had a question about number of clients against a zookeeper cluster.  I 
>>> was looking at having between 10,000 and 100,000 (towards 100,000) watchers 
>>> within a single datacenter at a given time.  Assuming that some fraction of 
>>> that number are active clients and the r/w ratio is well within the 
>>> zookeeper norms, is that number within the realm of possibility for 
>>> zookeeper?  We're going to do testing and benchmarking and things, but I 
>>> didn't want to go down a rabbit hole if this is simply too much for a 
>>> single zookeeper cluster to handle.   The numbers I've seen in blog posts 
>>> vary and I saw that the observers feature may be useful in this kind of 
>>> setting.
>>>
>>> Maybe I'm underestimating zookeeper or maybe I don't have enough 
>>> information to tell.  I'm just trying to see if zookeeper is a good fit for 
>>> our use case.
>>>
>>> Thanks.
>>>
>
>
