Re: Share Zookeeper instance and Connection Limits
If you do test large (and by large I'm talking about millions of znodes and tens of millions of watches), be sure to allocate enough memory, get the latest JVM (1.6.0_17), and turn on incremental/CMS GC in the Sun JVM. You may find this helpful as well for tracking progress in real time: http://bit.ly/1iMZdg

Patrick

Thiago Borges wrote:
> On 16/12/2009 19:15, Patrick Hunt wrote:
>> Right, this was with 1GbE. No, I don't know anyone who has done this. But
>> it should be easy enough for you to test. Limit the amount of data you are
>> storing in znodes and it shouldn't be too terrible.
>
> Ok. I will do experiments in a lab with 40 Core 2 Duo machines, 2 GB RAM,
> and a common 5400 rpm disk. Try to believe that ;) The data inserted in
> znodes is quite small (<1 KB, 2 KB), but the number of znodes and watches
> is large.
>
>> Not currently, this feature is looking for someone interested enough to
>> provide some patches ;-)
>> https://issues.apache.org/jira/browse/ZOOKEEPER-546
>
> Maybe in the near future! ;)
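A minimal sketch of the tuning advice above, assuming the stock zkServer.sh start script (which typically appends the JVMFLAGS environment variable to the server's java command line); the heap size is an example, not a recommendation:

```shell
# Example only: 3 GB heap plus the incremental/CMS collector flags for a
# Sun 1.6-era JVM, exported the way zkServer.sh usually consumes them.
export JVMFLAGS="-Xmx3g -Xms3g -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
echo "$JVMFLAGS"
```

Size the heap so the full znode set plus watch bookkeeping fits with headroom; swapping a ZK server is far worse than GC pauses.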
Re: Share Zookeeper instance and Connection Limits
ZK does a good enough job of avoiding seeks that, if you give it some dedicated disks, you may not see much speedup from SSDs.

On Fri, Dec 18, 2009 at 9:13 AM, Thiago Borges wrote:
> Maybe in some specific environment, as described in ZOOKEEPER-546. But
> yes, I agree, it's only an idea. The world is changing to SSDs too!

--
Ted Dunning, CTO DeepDyve
Re: Share Zookeeper instance and Connection Limits
On 16/12/2009 20:06, Benjamin Reed wrote:
> I agree with Ted, it doesn't seem like a good idea to do in practice.

Maybe in some specific environment, as described in ZOOKEEPER-546. But yes, I agree, it's only an idea. The world is changing to SSDs too!

> 1) use tmpfs

My memory will be split in two then, right?

> 2) you can set forceSync to "no" in the configuration file to disable
> syncing to disk before acknowledging responses

Good.

> 3) if you really want to make the disk writes go away, you can modify the
> SyncRequestProcessor in the code

Even better. Thanks for pointing out the path!

--
Thiago Borges
Re: Share Zookeeper instance and Connection Limits
On 16/12/2009 19:15, Patrick Hunt wrote:
> Right, this was with 1GbE. No, I don't know anyone who has done this. But
> it should be easy enough for you to test. Limit the amount of data you are
> storing in znodes and it shouldn't be too terrible.

Ok. I will do experiments in a lab with 40 Core 2 Duo machines, 2 GB RAM, and a common 5400 rpm disk. Try to believe that ;) The data inserted in znodes is quite small (<1 KB, 2 KB), but the number of znodes and watches is large.

> Not currently, this feature is looking for someone interested enough to
> provide some patches ;-)
> https://issues.apache.org/jira/browse/ZOOKEEPER-546

Maybe in the near future! ;)

--
Thiago Borges
Re: Share Zookeeper instance and Connection Limits
I agree with Ted, it doesn't seem like a good idea to do in practice. However, you do have a couple of options if you are just testing things:

1) use tmpfs
2) you can set forceSync to "no" in the configuration file to disable syncing to disk before acknowledging responses
3) if you really want to make the disk writes go away, you can modify the SyncRequestProcessor in the code

ben

Ted Dunning wrote:
> I think that this would be a very bad idea because of restart issues. As it
> stands, ZK reads from disk snapshots on startup to avoid moving as much
> data from the other members of the cluster. You might consider putting the
> snapshots and log on a tmpfs file system if you really, really want this.
>
> On Wed, Dec 16, 2009 at 1:08 PM, Thiago Borges wrote:
>> Can a ZooKeeper ensemble run only in memory, rather than writing to both
>> memory and disk? Does this make sense given that I have a highly reliable
>> system? (Of course, at some point we need a "dump" to shut down and
>> restart the entire system.)
>>
>> Which limits throughput first, the disk I/O or the network?
>>
>> Thanks for your quick response. I'm studying ZooKeeper in my master's
>> thesis, for coordinating distributed index structures.
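For options 1) and 2) above, a throwaway test configuration might look like the fragment below. The tmpfs path is an example, and note that, depending on the ZooKeeper version, forceSync may instead be read from the zookeeper.forceSync Java system property. This trades durability for speed and belongs in benchmarks only:

```shell
# Testing-only zoo.cfg fragment: snapshots and txn log on tmpfs, and no
# fsync before acking writes. A crash or reboot loses all committed state.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
dataDir=/dev/shm/zk
dataLogDir=/dev/shm/zk
forceSync=no
EOF
grep forceSync "$cfg"   # prints "forceSync=no"
```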
Re: Share Zookeeper instance and Connection Limits
I think that this would be a very bad idea because of restart issues. As it stands, ZK reads from disk snapshots on startup to avoid moving as much data from the other members of the cluster. You might consider putting the snapshots and log on a tmpfs file system if you really, really want this.

On Wed, Dec 16, 2009 at 1:08 PM, Thiago Borges wrote:
> Can a ZooKeeper ensemble run only in memory, rather than writing to both
> memory and disk? Does this make sense given that I have a highly reliable
> system? (Of course, at some point we need a "dump" to shut down and
> restart the entire system.)
>
> Which limits throughput first, the disk I/O or the network?
>
> Thanks for your quick response. I'm studying ZooKeeper in my master's
> thesis, for coordinating distributed index structures.

--
Ted Dunning, CTO DeepDyve
Re: Share Zookeeper instance and Connection Limits
Thiago Borges wrote:
> On 16/12/2009 16:45, Patrick Hunt wrote:
>> This test has 910 clients (sessions) involved:
>> http://hadoop.apache.org/zookeeper/docs/current/zookeeperOver.html#Performance
>> We have users with 10k sessions accessing a single 5 node ZK ensemble.
>> That's the largest I know about that's in production. I've personally
>> tested up to 20k sessions attaching to a 3 node ensemble with a 10 second
>> session timeout and it was fine (although I didn't do much other than test
>> session establishment and teardown). Also see this: http://bit.ly/4ekN8G
>
> The network in this test was gigabit ethernet, right? Do you know of
> anyone who has run ensembles on 100 Mbit/s ethernet?

Right, this was with 1GbE. No, I don't know anyone who has done this. But it should be easy enough for you to test. Limit the amount of data you are storing in znodes and it shouldn't be too terrible.

> Can a ZooKeeper ensemble run only in memory, rather than writing to both
> memory and disk? Does this make sense given that I have a highly reliable
> system? (Of course, at some point we need a "dump" to shut down and
> restart the entire system.)

Not currently, this feature is looking for someone interested enough to provide some patches ;-)
https://issues.apache.org/jira/browse/ZOOKEEPER-546

> Which limits throughput first, the disk I/O or the network?

I believe the current limitation is CPU bound on the ack processor (given that you have a dedicated txlog device). So neither, afaik.

> Thanks for your quick response. I'm studying ZooKeeper in my master's
> thesis, for coordinating distributed index structures.

NP. Enjoy.

Patrick
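Patrick's aside about a dedicated txlog device corresponds to splitting two zoo.cfg keys; the device paths below are hypothetical:

```shell
# Sketch: give the transaction log its own spindle so its sequential
# fsyncs never compete with snapshot writes. Paths are examples only.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
dataDir=/disk1/zookeeper/snapshots
dataLogDir=/disk2/zookeeper/txnlog
EOF
grep -c 'Dir=' "$cfg"   # prints 2
```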
Re: Share Zookeeper instance and Connection Limits
On 16/12/2009 16:45, Patrick Hunt wrote:
> This test has 910 clients (sessions) involved:
> http://hadoop.apache.org/zookeeper/docs/current/zookeeperOver.html#Performance
> We have users with 10k sessions accessing a single 5 node ZK ensemble.
> That's the largest I know about that's in production. I've personally
> tested up to 20k sessions attaching to a 3 node ensemble with a 10 second
> session timeout and it was fine (although I didn't do much other than test
> session establishment and teardown). Also see this: http://bit.ly/4ekN8G

The network in this test was gigabit ethernet, right? Do you know of anyone who has run ensembles on 100 Mbit/s ethernet?

Can a ZooKeeper ensemble run only in memory, rather than writing to both memory and disk? Does this make sense given that I have a highly reliable system? (Of course, at some point we need a "dump" to shut down and restart the entire system.)

Which limits throughput first, the disk I/O or the network?

Thanks for your quick response. I'm studying ZooKeeper in my master's thesis, for coordinating distributed index structures.

--
Thiago Borges
Re: Share Zookeeper instance and Connection Limits
Thiago Borges wrote:
> I read the documentation at the ZooKeeper site and can't find any text
> about sharing/limits of ZooKeeper client connections.

No limits particular to ZK itself (given enough memory) - usually the limitations are due to the max number of file descriptors the host OS allows. Often this is on the order of 1-8k; check your ulimit.

> I only see the parameter in the .conf file for the max number of
> connections per client.

This is to limit "DOS" attacks - it was added after we saw issues with buggy client implementations that would create infinite numbers of sessions with the ZK service, eventually running into the FD limit problem I mentioned.

> Can someone point me to some documentation about sharing ZooKeeper
> connections? Can I do this among different threads?

The API docs have those details:
http://hadoop.apache.org/zookeeper/docs/current/api/index.html
Generally the client interface is thread safe, though.

> And about client connection limits: how much does throughput decrease as
> the number of connections increases?

This test has 910 clients (sessions) involved:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperOver.html#Performance
We have users with 10k sessions accessing a single 5 node ZK ensemble. That's the largest I know about that's in production. I've personally tested up to 20k sessions attaching to a 3 node ensemble with a 10 second session timeout and it was fine (although I didn't do much other than test session establishment and teardown). Also see this: http://bit.ly/4ekN8G

Patrick
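The FD ceiling Patrick mentions can be checked (and, within the hard limit, raised for the current shell) like this; 8192 is an example value, not a recommendation:

```shell
# Show the current soft limit on open file descriptors for this shell:
ulimit -n
# Try to raise it; silently skipped if it exceeds the hard limit or
# would require root. Each ZK session holds one FD on the server side.
ulimit -n 8192 2>/dev/null || true
ulimit -n
```

To make the change permanent you would normally edit /etc/security/limits.conf (or your platform's equivalent) for the user running the ZK server.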
Share Zookeeper instance and Connection Limits
I read the documentation at the ZooKeeper site and can't find any text about sharing/limits of ZooKeeper client connections. I only see the parameter in the .conf file for the max number of connections per client.

Can someone point me to some documentation about sharing ZooKeeper connections? Can I do this among different threads?

And about client connection limits: how much does throughput decrease as the number of connections increases?

Thanks,
--
Thiago Borges