Hi,

Thanks for your reply! I've created these two JIRA tickets to track the work:

https://issues.apache.org/jira/browse/MESOS-5829
https://issues.apache.org/jira/browse/MESOS-5828

I've assigned them to myself and I would appreciate a shepherd to work with.

Here's another question: I see that replicated_log uses Network instead of
ZookeeperNetwork when running in non-HA mode (where --quorum is hardcoded to
'1'). However, replicated_log is stated to coordinate with other replicas
through a group of PIDs [1]. Does this imply that Paxos is running in
multi-master mode (every node assumes itself to be the coordinator)?

@Avinash: I should've been more precise and said that we want the
replicated_log Network to be pluggable. Basically, we want to plug in our own
Network implementation backed by etcd instead of ZooKeeper.

Thanks!
Jay

[1] https://github.com/apache/mesos/blob/master/include/mesos/log/log.hpp#L189

Joseph Wu <[email protected]> wrote on 07/09/2016 01:54:14:

> From: Joseph Wu <[email protected]>
> To: dev <[email protected]>
> Cc: Jie Yu <[email protected]>, Kapil Arya <[email protected]>
> Date: 07/09/2016 01:54
> Subject: Re: [Replicated Log] Enable Mesos to use etcd for replicated_log
>
> Jay,
>
> (1) Looks like we missed this when we modularized the
> MasterDetector/Contender [1]. We need to expand on src/master/main.cpp a
> bit.
> Can you file a bug? (cc: Kapil) I can shepherd if Kapil doesn't have the
> cycles.
>
> (2) The bit of the replicated log which relies on ZK is a small portion
> called the ZookeeperNetwork [2]. The job of this component is to watch the
> ZK group for membership changes. Log replication messages are broadcast
> to all members in this "network abstraction".
> This is also a piece that needs to be modularized. (Can you file another
> bug? :)
>
> (3) The replicated log is something stored locally on the master (i.e.
> LevelDB). The network abstraction has some similarity with the
> MasterDetector, but those pieces are otherwise unrelated.
> i.e. The MasterContender is the piece that decides the "coordinator" of the
> replicated log. But the replicated log uses its own implementation of
> Paxos after the coordinator is chosen.
>
> [1] https://issues.apache.org/jira/browse/MESOS-4610
> [2] https://github.com/apache/mesos/blob/master/src/log/network.hpp#L107
>
> On Fri, Jul 8, 2016 at 9:25 AM, Avinash Sridharan <[email protected]>
> wrote:
>
> > +Jie
> >
> > I think replicated log uses ZK only for leader election. Hence, without ZK
> > the quorum is hard-coded to 1.
> >
> > For (#2), trying to understand what you mean by replicated log being
> > pluggable? You mean turning off replicated log on the Master for storing
> > Registrar information?
> >
> > On Fri, Jul 8, 2016 at 2:26 AM, Jay JN Guo <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We are working on a Mesos module to substitute Zookeeper with Etcd.
> > > Contender and detector are done through modularized interfaces, however,
> > > replicated_log is still coupled with ZK. Here are my questions:
> > >
> > > #1 What's the difference between replicated_log with/without ZK? Without
> > > flag --zk, Log is constructed with hardcoded quorum of 1. Does it assume
> > > master to be running in non-HA mode? Otherwise, we observed that znodes
> > > are created in ZK to store log_replica information, does it help Paxos
> > > coordination in some way?
> > > #2 We hope to make replicated_log pluggable. Some code changes need to
> > > happen in Mesos upstream (interface modularization, extra flags, etc). So
> > > we wonder if someone could shepherd them? Also, it would be great if we
> > > could get some help on better understanding replicated_log internals.
> > > #3 Is there a plan to use replicated_log to do master contend/detect
> > > instead of ZK? If yes, what's the status?
> > >
> > > Your help and suggestions are highly appreciated!!
> > >
> > > Thanks,
> > > /Jay
> >
> > --
> > Avinash Sridharan, Mesosphere
> > +1 (323) 702 5245
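
To make the "network abstraction" discussed in this thread more concrete, here
is a minimal C++ sketch of a Network whose membership comes from a pluggable
source. Every name in it (Member, Callback, MembershipSource, StaticSource,
FakeSource, Network) is hypothetical and chosen only for illustration. It is
not the interface in src/log/network.hpp, and FakeSource is an in-memory
stand-in, not a real etcd or ZooKeeper client.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// NOTE: every name below is hypothetical; this is not the Mesos API.
using Member = std::string;  // stand-in for a replica's PID
using Callback = std::function<void(const std::set<Member>&)>;

// Pluggable source of group membership (ZooKeeper, etcd, a fixed list, ...).
class MembershipSource {
public:
  virtual ~MembershipSource() = default;
  // Registers a callback that receives the full member set on every change.
  virtual void watch(Callback callback) = 0;
};

// Fixed membership: reports its set once, when watch() is called.
class StaticSource : public MembershipSource {
public:
  explicit StaticSource(std::set<Member> members)
    : members_(std::move(members)) {}
  void watch(Callback callback) override { callback(members_); }
private:
  std::set<Member> members_;
};

// In-memory stand-in for a store-backed source; publish() simulates the
// store reporting a membership change.
class FakeSource : public MembershipSource {
public:
  void watch(Callback callback) override { callback_ = std::move(callback); }
  void publish(const std::set<Member>& members) {
    assert(callback_ && "watch() must be called before publish()");
    callback_(members);
  }
private:
  Callback callback_;
};

// Tracks the current members and "broadcasts" a message to all of them.
// The callback captures `this`, so a Network must outlive its source's use
// of the callback; copying is disabled for the same reason.
class Network {
public:
  explicit Network(MembershipSource& source) {
    source.watch([this](const std::set<Member>& m) { members_ = m; });
  }
  Network(const Network&) = delete;
  Network& operator=(const Network&) = delete;

  // Returns the (member, message) pairs that would be sent.
  std::vector<std::pair<Member, std::string>> broadcast(
      const std::string& message) const {
    std::vector<std::pair<Member, std::string>> sent;
    for (const Member& member : members_) {
      sent.emplace_back(member, message);
    }
    return sent;
  }

  std::size_t size() const { return members_.size(); }

private:
  std::set<Member> members_;
};
```

In this sketch, the non-HA case corresponds to a Network built on a
StaticSource with a single member, while an etcd-backed module would supply
its own MembershipSource that calls the callback whenever the watched group
changes. Threading and error handling are deliberately left out.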
