ZooKeeper partitioning and write scalability - a project report

Pramod Biligiri Tue, 17 Jun 2014 02:14:19 -0700

Hi all,
Thanks for the help offered by people on the list (esp Ted Dunning, Mitchi
Mutsuzaki and Vitalli Tymchyshyn for initial pointers).
We were able to make some progress on Partitioned ZooKeeper (
http://wiki.apache.org/hadoop/ZooKeeper/PartitionedZookeeper). Our project
report is available here:
http://www.slideshare.net/pramodbiligiri/zookeeper-partitioningprojectreport


We modified the ZooKeeper code to create a rudimentary proof-of-concept
implementation to test only the performance critical parts (the sharding
logic is hardcoded, leader failure/election is not handled etc)

Our experiments showed a 14% increase in throughput. Unfortunately we
didn't have time to drill down into the performance improvement in more
detail. For example, we couldn't successfully turn off disk-write caching
on EBS volumes, run with 3 quorums on each node, or try our changes on a
larger cluster. So I would take that value with a grain of salt.

Comments and criticism are welcome. If people see some value in this work,
I'll try to run some more experiments. The code is here:
https://bitbucket.org/pramodbiligiri/zk-multileader

The code changes were not much: We added a QuorumPeerThread class to
QuorumPeerMain that launches multiple QuorumPeers on each node, and a
client request is routed to the appropriate QuorumPeer by examining its
path. The existence of multiple quorums is abstracted in most other places
by a class QuorumPeerMulti that wraps multiple QuorumPeer-s and acts as a
facade. Note that each quorum on each node writes to its own transaction
log device and data directory.

Pramod
-- 
http://twitter.com/pramodbiligiri

ZooKeeper partitioning and write scalability - a project report

Reply via email to