[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048510#comment-13048510
 ] 

Hari A V commented on ZOOKEEPER-646:
------------------------------------

Hi Kay, 

I am looking forward to doing a prototype of this. I would be very interested 
to know the practical use cases for partitioned ZooKeeper that you have in 
mind. As I understand it, the high-level problem it tries to solve is write 
throughput scalability, i.e. when we add more ZooKeeper nodes, we should be 
able to get more "write throughput". 

>From "https://cwiki.apache.org/ZOOKEEPER/partitionedzookeeper.html";
 "By having distinct ensembles handling different portions of the state, we end 
up relaxing the ordering guarantees" 
How different it is from directly running separate ensembles ? One can as well 
run different Zookeeper cluster to achieve this right? Whether the solution 
also address running multiple name spaces in , say an existing 3 Node Zookeeper 
cluster.  

I can think of something like this - 
Currently, write operations from all clients are processed sequentially by the 
leader ZooKeeper. The suggestion is to allow parallel writes for unrelated data 
in the same ensemble. For example, in a cluster setup, the same ZK ensemble may 
be used by HBase for its metadata and by other components for cluster 
configuration management. We don't need to queue these operations and perform 
them sequentially; they can go in parallel. But all HBase operations may still 
need to be sequential to keep the order of operations.
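
To make the "unrelated data" point concrete, here is a minimal sketch using the 
plain ZooKeeper Java client with chroot connect strings (the host names, paths 
and session timeout are made up for illustration, and I assume "/hbase" and 
"/conf" already exist). Today both creates below are still queued and ordered 
by the single leader, which is exactly what parallel writes would relax:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class UnrelatedSubtrees {
        public static void main(String[] args) throws Exception {
            // Each client is chrooted to its own subtree of the shared ensemble.
            ZooKeeper hbaseZk =
                new ZooKeeper("zk1:2181,zk2:2181,zk3:2181/hbase", 30000, event -> { });
            ZooKeeper confZk =
                new ZooKeeper("zk1:2181,zk2:2181,zk3:2181/conf", 30000, event -> { });

            // Writes to unrelated subtrees, yet both go through the same leader today.
            hbaseZk.create("/meta-region", "rs1".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            confZk.create("/cluster-id", "prod".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

            hbaseZk.close();
            confZk.close();
        }
    }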

Here 
(http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/)
 I found another idea of hash-based partitioning for ZooKeeper:
"The solution we suggest is simply to run more than one ZooKeeper cluster for 
the purposes of locking and transactions, and simply to hash locks and 
transactions onto particular clusters". 
There they want to address locks specifically. I am thinking of performing a 
hash on the "root nodes" themselves (or introducing a partition name) and 
performing operations in parallel in the ZK server (in most scenarios, znodes 
such as "/conf" and "/leaders" are unrelated). It is more like running multiple 
partitions in the same ensemble, effectively making writes parallel in the 
leader ZK of an ensemble; a rough sketch of the routing side is below. I still 
need to think more about the transaction log and snapshotting aspects and how 
they would be affected. 
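
As a rough sketch of the hash-on-root-node routing idea (purely illustrative: 
the connect strings are hypothetical, and this hashes across separate ensembles 
on the client side, whereas the real change would have to live inside the 
leader):

    import org.apache.zookeeper.ZooKeeper;

    public class HashPartitionedZk {
        // One connect string per partition (here, per ensemble); hypothetical hosts.
        private static final String[] PARTITIONS = {
            "zka1:2181,zka2:2181,zka3:2181",
            "zkb1:2181,zkb2:2181,zkb3:2181"
        };
        private final ZooKeeper[] clients = new ZooKeeper[PARTITIONS.length];

        public HashPartitionedZk() throws Exception {
            for (int i = 0; i < PARTITIONS.length; i++) {
                clients[i] = new ZooKeeper(PARTITIONS[i], 30000, event -> { });
            }
        }

        // Route an operation by hashing the root node of its path, so that
        // "/conf/..." and "/leaders/..." can land on different partitions.
        public ZooKeeper clientFor(String path) {
            String root = path.split("/", 3)[1];   // "/conf/x" -> "conf"
            int idx = (root.hashCode() & Integer.MAX_VALUE) % clients.length;
            return clients[idx];
        }
    }

With this, clientFor("/conf/broker-1") and clientFor("/leaders/hbase") may 
return different clients, so their writes are ordered independently of each 
other; that independence is exactly the relaxation of ordering guarantees the 
wiki page mentions.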

I would be glad to hear from you guys.  

- Hari

> Namespace partitioning in ZK 
> -----------------------------
>
>                 Key: ZOOKEEPER-646
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-646
>             Project: ZooKeeper
>          Issue Type: New Feature
>            Reporter: Kay Kay
>
> Tracking JIRA for namespace partitioning in ZK 
> From the mailing list (- courtesy: Mahadev / Flavio ) , discussion during Jan 
> 2010 - 
> "Hi, Mahadev said it all, we have been thinking about it for a while, but
> >> haven't had time to work on it. I also don't think we have a jira open for
> >> it; at least I couldn't find one. But, we did put together some comments:
> >>
> >>    http://wiki.apache.org/hadoop/ZooKeeper/PartitionedZookeeper
> >>
> >> One of the main issues we have observed there is that partitioning will
> >> force us to change our consistency guarantees, which is far from ideal.
> >> However, some users seem to be ok with it, but I'm not sure we have
> >> agreement.
> >>
> >> In any case, please feel free to contribute or simply express your
> >> interests so that we can take them into account.
> >>
> >> Thanks,
> >> -Flavio
> >>
> >>
> >> On Jan 15, 2010, at 12:49 AM, Mahadev Konar wrote:
> >>
> > >>> Hi kay,
> > >>>  the namespace partitioning in zookeeper has been on a back burner for a
> > >>> long time. There isnt any jira open on it. There had been some
> > >>> discussions
> > >>> on this but no real work. Flavio/Ben have had this on there minds for a
> > >>> while but no real work/proposal is out yet.
> > >>>
> > >>> May I know is this something you are looking for in production?
> > >>>
> > >>> Thanks
> > >>> mahadev
> "

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

