Hi Ted,

I was only passingly familiar with multi, so I just took a look at that part of the code. It seems to provide a way to implement transactions.
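
To make the concern concrete, here is a toy Python model (not ZooKeeper code) of the prefix-based partitioning I proposed. The names `container_for` and `containers_touched` are made up for illustration, and the list of (op, path) pairs is just a stand-in for a real multi request. It maps each path to a container by its top-level prefix and then lists the containers a single multi would touch.

```python
def container_for(path):
    """Return the top-level prefix of a path, e.g. '/app1/child1' -> '/app1'."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute: %r" % (path,))
    parts = path.split("/")
    # parts[0] is '' because the path starts with '/'; parts[1] is the top-level node.
    return "/" + parts[1]


def containers_touched(multi_ops):
    """Return the sorted list of containers that a multi's operations refer to."""
    return sorted({container_for(path) for _op, path in multi_ops})


# A hypothetical multi that changes nodes under two different top-level prefixes.
multi_ops = [
    ("setData", "/app1/child1"),
    ("create", "/app2/child1"),
    ("delete", "/app1/child2"),
]

touched = containers_touched(multi_ops)
print(touched)
```

If such a multi has to be applied atomically, every container in that list would presumably have to be held at the same time, so those containers could not make progress independently while it runs.
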
I guess you mean that you can't parallelize the workload because a multi command might require locking all the containers? Let me know if I'm missing something.

Thanks,
Pramod

On Sun, May 18, 2014 at 9:12 PM, Ted Dunning <[email protected]> wrote:

> Pramod,
>
> Have you looked at the multi command? That might cause you some serious
> heartburn.
>
> On Sun, May 18, 2014 at 8:25 PM, Pramod Biligiri
> <[email protected]> wrote:
>
> > Hi,
> > [Let me know if you want this thread moved to the Dev list (or even to
> > JIRA). I was only seeing automated mails there so I thought I'll go ahead
> > and post here]
> >
> > I have been looking at the codebase the last couple of days (see my notes
> > regarding the same here:
> > https://docs.google.com/document/d/1TcohOWsUBXehS-E50bYY77p8SnGsF3IrBtu_LleP80U/edit
> > ).
> >
> > We are planning to do a proof-of-concept for the partitioning concept as
> > part of a class project, and measure any possible performance gains. Since
> > we're new to Zookeeper and short on time, it may not be the *right* way to
> > do it, but I hope it can give some pointers for the future.
> >
> > Design approaches to implement a partitioned Zookeeper
> >
> > For starters, let's assume we only parallelize accesses to paths starting
> > with a different top-level prefix, i.e. /app1/child1, /app2/child1, /config
> > etc
> >
> > Possible approach:
> >
> > Have a different tree object for each top-level node (/app1, /app2 etc).
> > This loosely corresponds to a container in the Wiki page [1], and
> > corresponds to the DataTree class in the codebase
> >
> > - As soon as a request comes in, associate it with one of the trees. Since
> > each request necessarily has a path associated with it, this is possible.
> >
> > - Then, all the queues that are used to process requests should operate
> > parallelly on these different trees. This can be done by having multiple
> > queues - one for each container.
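
A rough sketch of the two steps above, as a toy Python model rather than ZooKeeper code. `PartitionedProcessor`, `container_for`, and the dict standing in for each tree are all made up for illustration, and the sketch ignores the real RequestProcessor pipeline entirely. It only shows requests being routed by top-level prefix into one queue per container, with one worker per queue.

```python
import queue
import threading


def container_for(path):
    """Return the top-level prefix of a path, e.g. '/app1/child1' -> '/app1'."""
    return "/" + path.split("/")[1]


class PartitionedProcessor:
    """Toy model: one FIFO queue and one worker thread per container."""

    def __init__(self, containers):
        # A plain dict stands in for the per-container tree.
        self.trees = {c: {} for c in containers}
        self.queues = {c: queue.Queue() for c in containers}
        for c in containers:
            # Daemon threads so the sketch exits without explicit shutdown.
            threading.Thread(target=self._run, args=(c,), daemon=True).start()

    def submit(self, path, data):
        # Route the request to its container's queue (KeyError if unknown).
        self.queues[container_for(path)].put((path, data))

    def _run(self, container):
        q = self.queues[container]
        while True:
            path, data = q.get()
            # Writes within one container are applied in queue order.
            self.trees[container][path] = data
            q.task_done()

    def drain(self):
        # Block until every queued request has been applied.
        for q in self.queues.values():
            q.join()


proc = PartitionedProcessor(["/app1", "/app2", "/config"])
proc.submit("/app1/child1", b"a")
proc.submit("/app2/child1", b"b")
proc.submit("/app1/child1", b"c")  # same container, so ordered after the first write
proc.drain()
print(proc.trees["/app1"])
```

Ordering is preserved only within a container here; nothing orders writes across containers, which is exactly where a cross-container request would need extra coordination.
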
> >
> > Potential issues:
> >
> > - Whether ZK code is designed to work with multiple trees instead of just
> > one
> >
> > - Whether the queuing process (which uses RequestProcessors) is designed to
> > handle multiple queues
> >
> > - Make sure performance actually improves, and does not degrade!
> >
> > Discussion:
> >
> > - Where is the performance benefit actually going to come from?
> >
> > Intuitively, we might think that parallel trees might give a benefit, but
> > since each node logs all change records to disk before applying them, isn't
> > disk the throughput bottleneck? If I remember right, the ZK paper says that
> > with proper configs, they are able to make ZK I/O bound.
> >
> > So along with having separate trees and associated processing, should we
> > also have separate logging to disk for each tree? Will this actually help
> > in improving write speeds to disk?
> >
> > References:
> >
> > 1. The wiki page:
> > http://wiki.apache.org/hadoop/ZooKeeper/PartitionedZookeeper
> >
> > 2. The JIRA discussion:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-646
> >
> > 3. In this blog post, see the section called Scalability and Hashing
> > Zookeeper clusters:
> > http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cage
> >
> > Thanks,
> > Pramod
> > --
> > http://twitter.com/pramodbiligiri
> >
> > On Fri, May 16, 2014 at 10:56 PM, Pramod Biligiri
> > <[email protected]> wrote:
> >
> > > Thanks Michi,
> > > That was a very useful link! :)
> > >
> > > Pramod
> > >
> > > On Fri, May 16, 2014 at 3:37 PM, Michi Mutsuzaki <[email protected]> wrote:
> > >
> > >> Hi Pramod,
> > >>
> > >> No it has not been implemented, and I'm not aware of any recipes.
> > >> There is an open JIRA for this feature.
> > >>
> > >> https://issues.apache.org/jira/browse/ZOOKEEPER-646
> > >>
> > >> On Thu, May 15, 2014 at 12:59 PM, Pramod Biligiri
> > >> <[email protected]> wrote:
> > >> > Hi,
> > >> > The Zookeeper wiki talks about Partitioned Zookeeper:
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/ZOOKEEPER/PartitionedZooKeeper
> > >> >
> > >> > I wanted to know if that has already been implemented or not. If not,
> > >> > are there some recipes which can make Zookeeper behave in that way?
> > >> >
> > >> > Thanks.
> > >> >
> > >> > Pramod
