Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Ted Dunning
This may be true for monster clusters being supported by a ZK cluster or for data-intensive operations like high throughput queuing, but many applications of ZK are incredibly low volume. For instance, with Katta, there is a read-modify-write when a search engine comes up, when a shard is assigned

Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Scott Carey
You don't need a dedicated disk for performance purposes if you use an SSD (the decent ones have sub-0.1ms write latency), or if you have a battery-backed cache RAID card with write-back cache enabled so that synchronous writes return nearly instantly. It's just the synchronous write latency f

Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Jonathan Gray
Yup. I would say that the use of ZK by HBase today is very minimal. Very few writes at all, almost exclusively reads and still not that often. Dedicated resources would not make much of a difference. For users who are just testing, developing, or running clusters that are not highly loaded,

Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Benjamin Reed
you need a dedicated disk for the logDir, but not the dataDir. the reason is that the write to the log is in the critical path: we cannot commit changes until they have been synced to disk, so we want to make sure that we don't contend for the disk. the snapshots in the dataDir are done in an a

Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Jonathan Gray
Thanks for the input. Honestly, I'm thinking I need to have separate clusters. The version of ZK is one thing; but also for an application like HBase, we have had periods where we needed to patch ZK before it became part of a release. Keeping track of that on a shared cluster will be tricky,

Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Benjamin Reed
we designed zk to have high performance so that it can be shared by multiple applications. the main thing is that you use dedicated zk machines (with a dedicated disk for logging). once you have that in place, watch the load on your cluster; as long as you aren't saturating the cluster you shou

Re: Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Mahadev Konar
Hi Jonathan, Regarding sharing ZooKeeper servers among applications, we actually recommend doing that. We ask our users to have dedicated 3-5 node clusters with dedicated machines (/disk) for ZooKeeper and let applications share those 3-5 node clusters. But we do have users (inside Yahoo!)

Multiple ZK clusters or a single, shared cluster?

2009-07-17 Thread Jonathan Gray
Hey guys, Been using ZK indirectly for a few months now in the HBase and Katta realms. Both of these applications make it really easy so you don't have to be involved much with managing your ZK cluster to support it. I'm now using ZK for a bunch of things internally, so now I'm manually con