Hi, I'm Patrick from the ZooKeeper team. Recently we (the ZK dev community) have been working more closely with the HBase dev team, in particular to ensure that ZK-HBase interaction is the best it can be and to improve things where it's not. We are also looking at how HBase might take more advantage of ZK in the future. Please see the following for some running commentary (esp. see the "use cases" link):
http://wiki.apache.org/hadoop/ZooKeeper/HBaseAndZooKeeper

One of the issues we identified early on is that HBase users don't always know what to expect from ZooKeeper wrt scaling. As a result of that discussion I created the following document, which details ZK scalability with varying numbers of CPU cores available (1, 2, 4) while varying the client load and ZK configuration. See the summary section near the top for some graphs that give a good overview (note: the load used in this test is at least a couple of orders of magnitude greater than that applied by HBase currently):
http://bit.ly/4ekN8G

If you are operating a ZK cluster, or would just like to know more about ZK operations (esp. monitoring), you might also want to review the ZooKeeper operations manual:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html

We also have a running wiki page to help with troubleshooting ZK issues. It is based on problems users have had in the past, and it also gives some ideas on how to troubleshoot if you do experience problems:
http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

I hope you find these useful. If you'd like more detail on ZooKeeper I encourage you to subscribe to the ZK user list:
http://hadoop.apache.org/zookeeper/mailing_lists.html#Users
You can also follow me on http://twitter.com/phunt where I occasionally post issues of interest to the ZK/Hadoop community.

Regards,

Patrick
