Hello Norman, Eric and the list,

2012/1/25 Norman Maurer <[email protected]>:
> Thanks for the link... After thinking more about it I guess using the
> Version of the znode would be the easiest solution and would fulfill
> all the needs. Having Integer.MAX_VALUE should also be good enough as
> this is per mailbox.
>
> Bye,
> Norman

After considering the information on the ZooKeeper web site [1], [2] and
all the input from the posts here, it's clear to me that the first version
will use plain ZooKeeper and rely on the znode version for sequence
generation, for both UIDs and ModSeq. This should scale well into the
millions with a single ZooKeeper ensemble.

After that we can use multiple ZooKeeper ensembles, where each ensemble
manages a shard of the mailboxes. The first thing that comes to mind is the
way Debian stores packages [3]: the first letter of the package name is
used as a directory, so all packages that start with the same letter end up
in a single directory. In the same way, we can make one ensemble handle all
mailboxes that start with 0-4 and another handle those that start with 5-9.
Assuming the mailboxes are generated uniformly, this splits the load in
half, which gives us horizontal scalability.

I will begin implementation sometime next week (hopefully).

Cheers,

[1] http://zookeeper.apache.org/doc/current/zookeeperOver.html#fg_zkPerfReliability
[2] http://wiki.apache.org/hadoop/ZooKeeper/ServiceLatencyOverview
[3] ftp://ftp.be.debian.org/debian/pool/main/

-- 
Ioan Eugen Stan
http://ieugen.blogspot.com/
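
P.S. To make the 0-4 / 5-9 split a bit more concrete, here is a minimal
sketch in Java. It is only an illustration, not the planned implementation:
the class name ShardRouter, its methods and the connect strings are all made
up for this example, and it assumes mailbox ids always start with a decimal
digit.

```java
/** Hypothetical router: picks a ZooKeeper ensemble from the first character of a mailbox id. */
public class ShardRouter {

    // Placeholder connect strings; the real host names are not part of this thread.
    public static final String[] ENSEMBLES = {
        "zk-a1:2181,zk-a2:2181,zk-a3:2181", // mailboxes starting with 0-4
        "zk-b1:2181,zk-b2:2181,zk-b3:2181"  // mailboxes starting with 5-9
    };

    /** Returns the index of the ensemble responsible for the given mailbox id. */
    public static int shardFor(String mailboxId) {
        if (mailboxId == null || mailboxId.isEmpty()) {
            throw new IllegalArgumentException("empty mailbox id");
        }
        char first = mailboxId.charAt(0);
        if (first < '0' || first > '9') {
            // Assumption: ids start with a decimal digit, as in the 0-4 / 5-9 example.
            throw new IllegalArgumentException("mailbox id must start with a digit: " + mailboxId);
        }
        return (first - '0') / 5; // digits 0-4 map to shard 0, digits 5-9 to shard 1
    }

    /** Returns the connect string of the ensemble responsible for the given mailbox id. */
    public static String connectStringFor(String mailboxId) {
        return ENSEMBLES[shardFor(mailboxId)];
    }

    public static void main(String[] args) {
        System.out.println(shardFor("0a3f"));
        System.out.println(shardFor("5b21"));
        System.out.println(connectStringFor("93c0"));
    }
}
```

The connect string returned here is what would be handed to the ZooKeeper
client for that mailbox. Splitting on the first character only balances the
load if the ids really are uniformly distributed, as noted above.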
