Re: How large an ensemble can one build with Zookeeper?
Chubby and Zookeeper have very different ways of getting to similar purposes. Chubby is a locking service, while Zookeeper is all about avoiding locks. Zookeeper is better described as a coordination service.

Regarding performance, I am pretty sure that Zookeeper could keep up with some pretty enormous clusters quite easily. I would expect that the performance of the underlying file system is more likely to be the critical performance issue.

On Wed, Mar 4, 2009 at 6:00 AM, David Pollak wrote:
> I understand that Google uses Chubby (a ZooKeeper clone... or vice versa :-) )
> as the coordination mechanism for Big Table. Do you have any insight into
> Chubby's performance characteristics... and if it would be possible to build
> a Big Table clone that had scalability characteristics of Big Table with
> ZooKeeper as the underlying coordinator?
Re: How large an ensemble can one build with Zookeeper?
I realize this discussion is over, but I did want to make one quick clarification. When we talk about ensembles, we are talking about the servers that make up the Zookeeper service. We refer to the servers that use the Zookeeper service as clients. We have systems here that use ensembles of five servers to provide Zookeeper service to thousands of client servers without problem.

ben

Chad Harrington wrote:
> Clearly Zookeeper can handle ensembles of a dozen or so servers. How large
> an ensemble can one build with Zookeeper? 100 servers? 10,000 servers?
> Are there limitations that make the system unusable at large numbers of
> servers?
>
> Thanks,
Re: How large an ensemble can one build with Zookeeper?
JD,

When I last looked at HBase (about a year ago), the performance was lacking. Have there been material improvements in HBase's performance in the last year?

Thanks,

David

PS -- If this is not the correct list for such questions, I pre-apologize. Just whack me with a 2x4 and I'll take the discussion off the ZooKeeper list.

On Wed, Mar 4, 2009 at 6:02 AM, Jean-Daniel Cryans wrote:
> David,
>
> This is exactly what we are doing in the HBase project (www.hbase.org).
> Zookeeper is currently being integrated for our next major version and some
> parts are already in place.
>
> Regards,
>
> J-D

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
Re: How large an ensemble can one build with Zookeeper?
David,

This is exactly what we are doing in the HBase project (www.hbase.org). Zookeeper is currently being integrated for our next major version and some parts are already in place.

Regards,

J-D

On Wed, Mar 4, 2009 at 9:00 AM, David Pollak wrote:
> I understand that Google uses Chubby (a ZooKeeper clone... or vice versa :-) )
> as the coordination mechanism for Big Table. Do you have any insight into
> Chubby's performance characteristics... and if it would be possible to build
> a Big Table clone that had scalability characteristics of Big Table with
> ZooKeeper as the underlying coordinator?
Re: How large an ensemble can one build with Zookeeper?
On Tue, Mar 3, 2009 at 9:33 PM, Ted Dunning wrote:
> zookeeper is not really what you would call a scalable system because all
> transactions that are updates go through the leader for serialization.
> Zookeeper is, instead, a high throughput HA system. That said, the
> throughput of a modest zookeeper cluster is fairly prodigious so for the
> normal application of coordinating a large cluster, these limits are beyond
> what just about anyone needs.
>
> For other uses, though, 50 K updates per second wouldn't cut it.

I understand that Google uses Chubby (a ZooKeeper clone... or vice versa :-) ) as the coordination mechanism for Big Table. Do you have any insight into Chubby's performance characteristics... and if it would be possible to build a Big Table clone that had scalability characteristics of Big Table with ZooKeeper as the underlying coordinator?

--
Lift, the simply functional web framework http://liftweb.net
Beginning Scala http://www.apress.com/book/view/1430219890
Follow me: http://twitter.com/dpp
Git some: http://github.com/dpp
Re: How large an ensemble can one build with Zookeeper?
Zookeeper is not really what you would call a scalable system, because all transactions that are updates go through the leader for serialization. Zookeeper is, instead, a high-throughput HA system. That said, the throughput of a modest Zookeeper cluster is fairly prodigious, so for the normal application of coordinating a large cluster, these limits are beyond what just about anyone needs.

For other uses, though, 50K updates per second wouldn't cut it.

On Mar 3, 2009, at 17:30, Chad Harrington wrote:
> Clearly Zookeeper can handle ensembles of a dozen or so servers. How large
> an ensemble can one build with Zookeeper? 100 servers? 10,000 servers?
> Are there limitations that make the system unusable at large numbers of
> servers?
>
> Thanks,
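To make the point about updates going through the leader concrete, here is a rough sketch of how the leader's outbound traffic per write grows with ensemble size. It assumes a simplified model in which the leader sends one proposal and one commit message to every follower for each write; it ignores acknowledgements, batching, and client traffic. The function name is hypothetical and the figures are message counts, not measured throughput.

```python
def leader_messages_per_write(ensemble_size: int) -> int:
    """Outbound leader messages for one write in a simplified model.

    Assumption: one proposal plus one commit per follower, no batching.
    """
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    followers = ensemble_size - 1  # every server except the leader
    proposals = followers          # the leader proposes the update to each follower
    commits = followers            # the leader tells each follower to commit
    return proposals + commits


for n in (3, 5, 7, 13):
    print(f"{n:>2} servers: {leader_messages_per_write(n)} leader messages per write")
```

Under these assumptions, adding servers adds work to every write on the one machine that serializes all writes, so a larger ensemble does not buy more write throughput.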
Re: How large an ensemble can one build with Zookeeper?
Hi Chad,

The maximum number of Zookeeper servers we have tested with is 13. Even with 13, the performance starts to degrade very quickly (compared to ensembles of 5 and 7). I am not sure we have current numbers (we have made 3x or so performance improvements), but the old numbers are in zookeeper.pdf at http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations. The slide is at the end. You can see that the performance drops with 13 servers.

We usually suggest 5 or 7 servers for ZooKeeper. We can get around 20K-30K writes per second and more than 50K reads per second from an ensemble of 5 servers (as of now, with the performance enhancements). With 5 servers you can tolerate a failure of 2 nodes.

To find out more about Zookeeper, please take a look at the Zookeeper presentations: http://wiki.apache.org/hadoop/ZooKeeper/ZooKeeperPresentations

What is the rationale behind having such a large number of Zookeeper servers?

Thanks
mahadev

On 3/3/09 5:30 PM, "Chad Harrington" wrote:
> Clearly Zookeeper can handle ensembles of a dozen or so servers. How large
> an ensemble can one build with Zookeeper? 100 servers? 10,000 servers?
> Are there limitations that make the system unusable at large numbers of
> servers?
>
> Thanks,
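The statement that 5 servers tolerate 2 failures follows from ZooKeeper needing a strict majority of the ensemble to be up. A small sketch of that arithmetic (the function names are made up for illustration):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5, 6, 7, 13):
    print(f"{n:>2} servers: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failures")
```

Note that 6 servers tolerate no more failures than 5, which is one reason odd ensemble sizes such as 5 or 7 are the usual suggestion.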
How large an ensemble can one build with Zookeeper?
Clearly Zookeeper can handle ensembles of a dozen or so servers. How large an ensemble can one build with Zookeeper? 100 servers? 10,000 servers? Are there limitations that make the system unusable at large numbers of servers?

Thanks,

--
Chad Harrington
CEO
DataScaler, Inc.
charring...@datascaler.com
201A Ravendale Dr.
Mountain View, CA 94043
Phone: 650-515-3437
Fax: 650-887-1544