Re: [Replicated Log] Enable Mesos to use etcd for replicated_log

Jay JN Guo Sun, 10 Jul 2016 03:39:13 -0700

Hi,

Thanks for your reply! I've created these two JIRA tickets to track the
work:
https://issues.apache.org/jira/browse/MESOS-5829
https://issues.apache.org/jira/browse/MESOS-5828
I've assigned them to myself and I would appreciate a shepherd to work
with.


Here's another question:
I see that replicated_log uses Network instead of ZookeeperNetwork when
running in non-HA mode (where --quorum is hardcoded to '1'). However,
replicated_log is documented as coordinating with the other replicas
through a group of PIDs [1]. Does this imply that Paxos is running in
multi-master mode (every node assumes itself to be the coordinator)?

@Avinash: I should've been more precise: what we want to be pluggable is
the replicated_log Network. Basically, we want to plug in our own Network
implementation backed by etcd instead of ZooKeeper.
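
A minimal sketch of what such a pluggable Network could look like is given
here. It is not the actual Mesos Network API: Pid, Network, EtcdNetwork and
onMembershipChange are hypothetical names chosen for illustration, and the
etcd watch is assumed rather than implemented (the caller passes the member
list in directly).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a libprocess PID ("name@ip:port").
using Pid = std::string;

// Hypothetical pluggable interface: the replicated log would depend only
// on this class, not on ZooKeeper or etcd directly.
class Network
{
public:
  // Callback used to deliver one message to one member.
  using Sender = std::function<void(const Pid&, const std::string&)>;

  explicit Network(Sender sender) : sender_(std::move(sender))
  {
    assert(sender_ && "a Network needs a way to deliver messages");
  }

  virtual ~Network() = default;

  // Replace the current membership with `members`.
  void set(const std::set<Pid>& members) { members_ = members; }

  // Send `message` to every current member; returns how many were sent.
  std::size_t broadcast(const std::string& message) const
  {
    for (const Pid& pid : members_) {
      sender_(pid, message);
    }
    return members_.size();
  }

  const std::set<Pid>& members() const { return members_; }

private:
  Sender sender_;
  std::set<Pid> members_;
};

// Hypothetical etcd-backed implementation. A real one would watch an etcd
// key prefix; in this sketch the watch result is passed in by the caller.
class EtcdNetwork : public Network
{
public:
  using Network::Network;

  // Called whenever the (assumed) etcd watch reports the current set of
  // replica PIDs registered for the group.
  void onMembershipChange(const std::vector<Pid>& replicas)
  {
    set(std::set<Pid>(replicas.begin(), replicas.end()));
  }
};
```

The point of the split is that the log code only ever calls broadcast(),
while the membership source (ZooKeeper, etcd, or a static list) stays
behind the subclass.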

Thanks!
Jay

[1]
https://github.com/apache/mesos/blob/master/include/mesos/log/log.hpp#L189

Joseph Wu <[email protected]> wrote on 07/09/2016 01:54:14:

> From: Joseph Wu <[email protected]>
> To: dev <[email protected]>
> Cc: Jie Yu <[email protected]>, Kapil Arya <[email protected]>
> Date: 07/09/2016 01:54
> Subject: Re: [Replicated Log] Enable Mesos to use etcd for replicated_log
>
> Jay,
>
> (1) Looks like we missed this when we modularized the
> MasterDetector/Contender [1].  We need to expand on src/master/main.cpp a
> bit.
> Can you file a bug?  (cc: Kapil)  I can shepherd if Kapil doesn't have the
> cycles.
>
> (2) The bit of the replicated log which relies on ZK is a small portion
> called the ZookeeperNetwork [2].  The job of this component is to watch the
> ZK group for membership changes.  Log replication messages are broadcast
> to all members in this "network abstraction".
> This is also a piece that needs to be modularized.  (Can you file another
> bug? :)
>
> (3) The replicated log is something stored locally on the master (i.e.
> LevelDB).  The network abstraction has some similarity with the
> MasterDetector, but those pieces are otherwise unrelated.
> i.e. The MasterContender is the piece that decides the "coordinator" of the
> replicated log.  But the replicated log uses its own implementation of
> Paxos after the coordinator is chosen.
>
> [1] https://issues.apache.org/jira/browse/MESOS-4610
> [2] https://github.com/apache/mesos/blob/master/src/log/network.hpp#L107
>
> On Fri, Jul 8, 2016 at 9:25 AM, Avinash Sridharan <[email protected]>
> wrote:
>
> > +Jie
> >
> > I think replicated log uses ZK only for leader election. Hence, without ZK
> > the quorum is hard-coded to 1.
> >
> > For (#2), I'm trying to understand what you mean by the replicated log
> > being pluggable. Do you mean turning off the replicated log on the Master
> > for storing Registrar information?
> >
> > On Fri, Jul 8, 2016 at 2:26 AM, Jay JN Guo <[email protected]> wrote:
> >
> > >
> > >
> > > Hi,
> > >
> > > We are working on a Mesos module to replace ZooKeeper with etcd.
> > > Contender and detector are done through modularized interfaces; however,
> > > replicated_log is still coupled with ZK. Here are my questions:
> > >
> > > #1 What's the difference between replicated_log with/without ZK? Without
> > > the --zk flag, Log is constructed with a hardcoded quorum of 1. Does it
> > > assume the master is running in non-HA mode? Otherwise, we observed that
> > > znodes are created in ZK to store log_replica information. Does that
> > > help Paxos coordination in some way?
> > > #2 We hope to make replicated_log pluggable. Some code changes need to
> > > happen in Mesos upstream (interface modularization, extra flags, etc.),
> > > so we wonder if someone could shepherd them. Also, it would be great if
> > > we could get some help with better understanding the replicated_log
> > > internals.
> > > #3 Is there a plan to use replicated_log for master contention/detection
> > > instead of ZK? If yes, what's the status?
> > >
> > > Your help and suggestions are highly appreciated!!
> > >
> > > Thanks,
> > > /Jay
> > >
> >
> >
> >
> > --
> > Avinash Sridharan, Mesosphere
> > +1 (323) 702 5245
> >
