Hey, I was just a little confused as to whether I'm waiting for your next
response or whether you wanted me to respond…

Besides leader election and network membership, ZooKeeper is also utilized
in some JNI code through ZooKeeperStorage. But I'm not sure if those JNI
libraries are actually used.

So if we could put all ZooKeeper-dependent functionality behind a module
interface and implement a few liboffkv-based modules, would that suffice?
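
To make the question concrete, here is a rough sketch of the kind of
interface I have in mind. All names (KvBackend, InMemoryBackend, makeBackend)
are hypothetical and are not existing Mesos or liboffkv APIs; the in-memory
class merely stands in for where a liboffkv-based module would go, and the
sketch assumes C++17 for std::optional.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical interface: ZooKeeper-dependent features (leader election,
// group membership, registry storage) would talk to this abstraction
// instead of talking to ZooKeeper directly.
class KvBackend {
public:
  virtual ~KvBackend() = default;

  // Creates `key`; returns false if it already exists.
  virtual bool create(const std::string& key, const std::string& value) = 0;

  // Returns the stored value, or std::nullopt if `key` is absent.
  virtual std::optional<std::string> get(const std::string& key) const = 0;

  // Removes `key`; returns false if it was absent.
  virtual bool erase(const std::string& key) = 0;
};

// Stand-in implementation backed by a std::map. A real module would wrap
// a liboffkv client here instead (assumption, not shown).
class InMemoryBackend : public KvBackend {
public:
  bool create(const std::string& key, const std::string& value) override {
    return entries_.emplace(key, value).second;
  }

  std::optional<std::string> get(const std::string& key) const override {
    auto it = entries_.find(key);
    if (it == entries_.end()) return std::nullopt;
    return it->second;
  }

  bool erase(const std::string& key) override {
    return entries_.erase(key) > 0;
  }

private:
  std::map<std::string, std::string> entries_;
};

// Hypothetical run-time selection of a backend by name.
std::unique_ptr<KvBackend> makeBackend(const std::string& name) {
  if (name == "memory") return std::make_unique<InMemoryBackend>();
  throw std::invalid_argument("unknown backend: " + name);
}
```

With something of this shape, callers would only ever hold a KvBackend
pointer, so swapping ZooKeeper for etcd or Consul would not touch them.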

What sort of timeframe are you looking at on your end? And are we waiting on
you, or do you want us to prepare the contributions, send them through, then
await your review?

PS: Happy to schedule a videoconference between our teams.

Samuel Marks
Charity <https://sydneyscientific.org> | consultancy <https://offscale.io>
| open-source <https://github.com/offscale> | LinkedIn
<https://linkedin.com/in/samuelmarks>


On Sat, Jun 13, 2020 at 11:12 AM Benjamin Mahler <bmah...@apache.org> wrote:

> Ah yes I forgot, the other piece is network membership for the replicated
> log, through our zookeeper::Group related code. Is that what you're
> referring to?
>
> We could put that behind a module interface as well.
>
> On Fri, Jun 12, 2020 at 9:10 PM Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > > Apache ZooKeeper is used for a number of different things in Mesos, with
> > > only leader election being customisable with modules. Your existing
> > > modular functionality is insufficient for decoupling from Apache
> > > ZooKeeper.
> >
> > Can you clarify which other functionality you're referring to? Mesos only
> > relies on ZK for leader election and detection. We do have some libraries
> > available in the code for storing the registry in ZK but we do not support
> > that currently.
> >
> > On Thu, Jun 11, 2020 at 11:02 PM Samuel Marks <sam...@offscale.io>
> wrote:
> >
> >> Apache ZooKeeper is used for a number of different things in Mesos, with
> >> only leader election being customisable with modules. Your existing
> >> modular
> >> functionality is insufficient for decoupling from Apache ZooKeeper.
> >>
> >> We are ready and waiting to develop here.
> >>
> >> As mentioned over our off-mailing-list communiqué:
> >>
> >> The main advantages of, and reasoning for, my investment into Mesos have
> >> been [the prospect of]:
> >>
> >>    - Making it performant and low-resource utilising on a very small
> >> number
> >>    of nodes… potentially even down to 1 node so that it can 'compete'
> with
> >>    Docker Compose.
> >>    - Reducing the number of distributed systems that all do the same
> thing
> >>    in a datacentre environment.
> >>       - Postgres has its own consensus, Docker—e.g, via Kubernetes or
> >>       Compose—has its own consensus, ZooKeeper has its own consensus,
> >> other
> >>       things like distributed filesystems… they too; have their own
> >> consensus.
> >>    - The big sell from that first point is actually showing people how
> to
> >>    run Mesos and use it for their regular day-to-day development, e.g.:
> >>    1. Context switching when the one engineer is on multiple projects
> >>       2. …then use the same technology at scale.
> >>    - The big sell from that second point is to reduce the network traffic,
> >>    speed up each system's consensus, through all using the one system, and
> >>    simplify analytics.
> >>
> >>    This would be a big deal for your bigger clients, who can easily
> >>    quantify what this network traffic costs, and what a reduction in
> >> network
> >>    traffic with a corresponding increase in speed would mean.
> >>
> >>    Eventually this will mean that Ops people can tradeoff guarantees for
> >>    speed (and vice-versa).
> >>    - Supporting ZooKeeper, Consul, and etcd is just the start.
> >>    - Supporting Mesos is just the start.
> >>    - We plan on adding more consensus-guaranteeing systems—maybe even
> our
> >>    own Paxos and Raft—and adding this to systems in the Mesos ecosystem
> >> like
> >>    Chronos, Marathon, and Aurora.
> >>    It is my understanding that a big part of Mesosphere's rebranding is
> >>    Kubernetes related.
> >>
> >> Recently—well, just before COVID19!—I spoke at the Sydney Kubernetes
> >> Meetup
> >> at Google. They too—including Google—were excited by the prospect of
> >> removing etcd as a hard-dependency, and supporting all the different
> ones
> >> liboffkv supports.
> >>
> >> I have the budget, team, and expertise at the ready to invest and
> >> contribute these changes. If there are certain design patterns and
> >> refactors you want us to commit to along the way, just say the word.
> >>
> >> Excitedly yours,
> >>
> >> Samuel Marks
> >> Charity <https://sydneyscientific.org> | consultancy <
> https://offscale.io
> >> >
> >> | open-source <https://github.com/offscale> | LinkedIn
> >> <https://linkedin.com/in/samuelmarks>
> >>
> >>
> >> On Wed, Jun 10, 2020 at 1:42 AM Benjamin Mahler <bmah...@apache.org>
> >> wrote:
> >>
> >> > AndreiS just reminded me that we have module interfaces for the master
> >> > detector and contender:
> >> >
> >> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/module/detector.hpp
> >> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/module/contender.hpp
> >> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/master/detector.hpp
> >> > https://github.com/apache/mesos/blob/1.9.0/include/mesos/master/contender.hpp
> >> >
> >> > These should allow you to implement the integration with your library. We
> >> > may need to adjust the interfaces a little, but this will let you get what
> >> > you need done without the burden on us to shepherd the work.
> >> >
> >> > On Fri, May 22, 2020 at 8:38 PM Samuel Marks <sam...@offscale.io>
> >> wrote:
> >> >
> >> > > Following on from the discussion on GitHub and here on the
> >> mailing-list,
> >> > > here is the proposal from me and my team:
> >> > > ------------------------------
> >> > >
> >> > > Choice of approach
> >> > >
> >> > > The “mediator” of every interaction with ZooKeeper in Mesos is the
> >> > > ZooKeeper
> >> > > class, declared in include/mesos/zookeeper/zookeeper.hpp.
> >> > >
> >> > > Of note are the following two differences in the *styles* of API
> >> provided
> >> > > by ZooKeeper class and liboffkv:
> >> > >
> >> > >    - Push-style mechanism of notifications on changes in “watched” data,
> >> > >    versus a pull-style one in liboffkv. In Mesos, the notifications are
> >> > >    delivered via the Watcher interface, defined in the same file as
> >> > >    ZooKeeper. This interface has the process method, which is invoked by
> >> > >    an instance of ZooKeeper at most once for each watch. There is also a
> >> > >    special event which informs the watcher that the connection has been
> >> > >    dropped. An optional instance of Watcher is passed to the constructor
> >> > >    of ZooKeeper.
> >> > >    - Asynchronous session establishment process in ZooKeeper versus a
> >> > >    synchronous one in liboffkv (if there is one at all; e.g. for Consul
> >> > >    there is no concept of “session” currently defined).
> >> > >
> >> > > The two users of the ZooKeeper class are:
> >> > >
> >> > >    1. GroupProcess;
> >> > >    2. ZooKeeperStorageProcess.
> >> > >
> >> > > We will thus evaluate the possible approaches of integrating
> liboffkv
> >> > into
> >> > > Mesos through the prism of details of their usage.
> >> > >
> >> > > The two possible approaches are:
> >> > >
> >> > >    1.
> >> > >
> >> > >    Replace all usages of ZooKeeper with liboffkv-specific code under
> >> > #ifdef
> >> > >    guards.
> >> > >
> >> > >    This approach would scale badly, as alternative liboffkv-specific
> >> > >    implementations will be needed for both of the users.
> >> > >
> >> > >    Moreover, we think that conditional compilation results in
> >> maintenance
> >> > >    nightmare; see, e.g.:
> >> > >    -
> >> > >
> >> > >       RealWaitForChar() in vim <https://geoff.greer.fm/vim/>;
> >> > >       -
> >> > >
> >> > >       “#ifdef Considered Harmful, or Portability Experience With C
> >> News”
> >> > >       paper by Henry Spencer and Geoff Collyer
> >> > >       <
> >> http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful.pdf>.
> >> > >
> >> > >    The creators of the C programming language, which introduced the
> >> > concept
> >> > >    in the first place, have also spoken against conditional
> >> compilation:
> >> > >    -
> >> > >
> >> > >       In “The Practice of Programming” by Brian W. Kernighan and Rob
> >> > Pike,
> >> > >       the following advice is given: “Avoid conditional compilation.
> >> > > Conditional
> >> > >       compilation with #ifdef and similar preprocessor directives is
> >> hard
> >> > >       to manage, because information tends to get sprinkled
> throughout
> >> > the
> >> > >       source.”
> >> > >       -
> >> > >
> >> > >       In “Plan 9 from Bell Labs” paper by Rob Pike, Ken Thompson et
> >> al.
> >> > >       <
> https://pdos.csail.mit.edu/archive/6.824-2012/papers/plan9.pdf
> >> >,
> >> > > the
> >> > >       following is said: “Conditional compilation, even with #ifdef,
> >> is
> >> > >       used sparingly in Plan 9. The only architecture-dependent
> >> #ifdefs
> >> > in
> >> > >       the system are in low-level routines in the graphics library.
> >> > > Instead, we
> >> > >       avoid such dependencies or, when necessary, isolate them in
> >> > > separate source
> >> > >       files or libraries. Besides making code hard to read, #ifdefs
> >> make
> >> > it
> >> > >       impossible to know what source is compiled into the binary or
> >> > whether
> >> > >       source protected by them will compile or work properly. They
> >> > > make it harder
> >> > >       to maintain software.”
> >> > >       2.
> >> > >
> >> > >    Modify the *implementation* of the ZooKeeper class to use
> liboffkv,
> >> > >    possibly renaming the class to something akin to KvClient to reflect
> >> > >    the fact that it would no longer be ZooKeeper-specific (this also
> >> > >    includes the renaming of error codes and other similar nomenclature).
> >> > >    The old
> >> > > version of
> >> > >    the implementation would be put under an #ifdef guard, thus
> >> minimising
> >> > >    the number — and maintenance impact — of #ifdefs.
> >> > >
> >> > > Naturally there are some advantages to taking the ifdef approach,
> >> namely
> >> > > that we can guarantee no difference in builds between before
> >> offscale's
> >> > > contribution and after, unless a compiler flag is provided.
> >> > >
> >> > > However, to avoid polluting the code, we are recommending the second
> >> > > approach.
> >> > >
> >> > > Incompatibilities
> >> > >
> >> > > The following is the list of incompatibilities between the
> interfaces
> >> of
> >> > > ZooKeeper class and liboffkv. Some of those features should be
> >> > implemented
> >> > > in liboffkv; others should be emulated inside the ZooKeeper/KvClient
> >> > class;
> >> > > and for others still, the change of the interface of
> >> ZooKeeper/KvClient
> >> > is
> >> > > the preferred solution.
> >> > >
> >> > >    - Asynchronous session establishment. We propose to emulate this
> >> > >    through spawning a new thread in the constructor of
> >> > >    ZooKeeper/KvClient.
> >> > >    - Push-style watch notification API. We propose to emulate this
> >> > >    through spawning a new thread for each watch; such a thread would then
> >> > >    do the wait and then invoke watcher->process() under a mutex. The
> >> > >    number of threads should not be a concern here, as the only user that
> >> > >    uses watches at all (GroupProcess) only registers at most one watch.
> >> > >    - Multiple servers in URL string. We propose to implement this in
> >> > >    liboffkv.
> >> > >    - Authentication. We propose to implement this in liboffkv.
> >> > >    -
> >> > >
> >> > >    ACLs (access control lists). The following ACLs are in fact used
> >> for
> >> > >    everything:
> >> > >
> >> > >    _auth.isSome()
> >> > >        ? zookeeper::EVERYONE_READ_CREATOR_ALL
> >> > >        : ZOO_OPEN_ACL_UNSAFE
> >> > >
> >> > >    We thus propose to:
> >> > >    1.
> >> > >
> >> > >       implement rudimentary support for ACLs in liboffkv in the form
> >> of
> >> > an
> >> > >       optional parameter to create(),
> >> > >
> >> > >           bool protect_modify = false
> >> > >
> >> > >       2.
> >> > >
> >> > >       change the interface of ZooKeeper/KvClient so that
> >> protect_modify
> >> > >       flag is used instead of ACLs.
> >> > >       -
> >> > >
> >> > >    Configurable session timeout. We propose to implement this in
> >> > liboffkv.
> >> > >    -
> >> > >
> >> > >    Getting the actual session timeout, which might differ from the
> >> > >    user-provided one as a result of timeout negotiation with the server.
> >> > >    We propose to implement this in liboffkv.
> >> > >    -
> >> > >
> >> > >    Getting the session ID. We propose to implement this in liboffkv,
> >> with
> >> > >    session ID being std::string; and to modify the interface
> >> accordingly.
> >> > >    It is possible to hash a string into a 64-bit number, but in the
> >> > >    circumstances given, we think it is just not worth it.
> >> > >    -
> >> > >
> >> > >    Getting the status of the connection to the server. We propose to
> >> > >    implement this in liboffkv.
> >> > >    -
> >> > >
> >> > >    Sequenced nodes. We propose to emulate this in the class. Here is
> >> the
> >> > >    pseudo-code of our solution:
> >> > >
> >> > >    while (true) {
> >> > >        [counter, version] = get("/counter")
> >> > >        seqnum = counter + 1
> >> > >        name = "label" + seqnum
> >> > >        try {
> >> > >            commit {
> >> > >                check "/counter" version,
> >> > >                set "/counter" seqnum,
> >> > >                create name value
> >> > >            }
> >> > >            break
> >> > >        } catch (TxnAborted) {}
> >> > >    }
> >> > >
> >> > >    -
> >> > >
> >> > >    “Recursive” creation of each parent in create(), akin to mkdir
> -p.
> >> > This
> >> > >    is already emulated in the class, as ZooKeeper does not natively
> >> > support
> >> > >    it; we propose to extend this emulation to work with liboffkv.
> >> > >    -
> >> > >
> >> > >    The semantics of the “set” operation if the entry does not exist:
> >> > >    ZooKeeper fails with ZNONODE in this case, while liboffkv creates a new
> >> > >    node. We propose to emulate this in-class with a transaction.
> >> > >    -
> >> > >
> >> > >    The semantics of the “erase” operation: ZooKeeper fails with ZNOTEMPTY
> >> > >    if the node has children, while liboffkv removes the subtree
> >> > >    recursively. As neither of the users ever attempts to remove a node
> >> > >    with children, we propose to change the interface so that it declares
> >> > >    (and actually implements) the liboffkv-compatible semantics.
> >> > >    -
> >> > >
> >> > >    Return of ZooKeeper-specific Stat structures instead of just
> >> versions.
> >> > >    As both users only use the version field of this structure, we
> >> propose
> >> > > to
> >> > >    simply alter the interface so that only the version is returned.
> >> > >    -
> >> > >
> >> > >    Explicit “session drop” operation that also immediately erases
> all
> >> the
> >> > >    “leased” nodes. We propose to implement this in liboffkv.
> >> > >    -
> >> > >
> >> > >    Check if the node being created has a leased parent. Currently,
> >> > >    liboffkv declares this to be unspecified behavior: it may either throw
> >> > >    (if ZooKeeper is used as the back-end) or successfully create the node
> >> > >    (otherwise). As neither of the users ever attempts to create such a
> >> > >    node, we propose to leave this as is.
> >> > >
> >> > > Estimates
> >> > >
> >> > > We estimate that, including tests, this will be ready by the end of next
> >> > > month.
> >> > > ------------------------------
> >> > >
> >> > > Open to alternative suggestions, otherwise we'll begin.
> >> > > Samuel Marks
> >> > > Charity <https://sydneyscientific.org> | consultancy <
> >> > https://offscale.io>
> >> > > | open-source <https://github.com/offscale> | LinkedIn
> >> > > <https://linkedin.com/in/samuelmarks>
> >> > >
> >> > >
> >> > > On Sat, May 2, 2020 at 4:04 AM Benjamin Mahler <bmah...@apache.org>
> >> > wrote:
> >> > >
> >> > > > So it sounds like:
> >> > > >
> >> > > > Zookeeper: Official C library has an async API. Are we gaining a
> lot
> >> > with
> >> > > > the third party C++ wrapper you pointed to? Maybe it "just works",
> >> but
> >> > it
> >> > > > looks very inactive and it's hard to tell how maintained it is.
> >> > > >
> >> > > > Consul: No official C or C++ library. Only some third party C++
> ones
> >> > that
> >> > > > look pretty inactive. The ppconsul one you linked to does have an
> >> issue
> >> > > > about an async API, I commented on it:
> >> > > > https://github.com/oliora/ppconsul/issues/26.
> >> > > >
> >> > > > etcd: Can use gRPC c++ client async API.
> >> > > >
> >> > > > Since 2 of 3 provide an async API already, I would lean more
> >> towards an
> >> > > > async API so that we don't have to change anything with the mesos
> >> code
> >> > > when
> >> > > > the last one gets an async implementation. However,  we currently
> >> use
> >> > the
> >> > > > synchronous ZK API so I realize this would be more work to first
> >> adjust
> >> > > the
> >> > > > mesos code to use the async zookeeper API. I agree that a
> >> synchronous
> >> > > > interface is simpler to start with since that will be an easier
> >> > > integration
> >> > > > and we currently do not perform many concurrent operations (and
> >> > probably
> >> > > > won't anytime soon).
> >> > > >
> >> > > > On Sun, Apr 26, 2020 at 11:17 PM Samuel Marks <sam...@offscale.io
> >
> >> > > wrote:
> >> > > >
> >> > > > > In terms of asynchronous vs synchronous interfacing, when we
> >> started
> >> > > > > liboffkv, it had an asynchronous interface. Then we decided to
> >> drop
> >> > it
> >> > > > and
> >> > > > > implemented a synchronous one, due to the dependent libraries
> >> which
> >> > > > > liboffkv uses under the hood.
> >> > > > >
> >> > > > > Our ZooKeeper implementation uses the zookeeper-cpp library
> >> > > > > <https://github.com/tgockel/zookeeper-cpp>—a well-maintained
> C++
> >> > > wrapper
> >> > > > > around common Zookeeper C bindings [which we contributed to
> vcpkg
> >> > > > > <https://github.com/microsoft/vcpkg/pull/7001>]. It has an
> >> > > asynchronous
> >> > > > > interface based on std::future
> >> > > > > <https://en.cppreference.com/w/cpp/thread/future>. Since
> >> std::future
> >> > > > does
> >> > > > > not provide chaining or any callbacks, a ZooKeeper-specific result
> >> > > > > cannot be asynchronously mapped to a liboffkv result. In early
> >> > > > > versions of liboffkv we used a thread pool to do the mapping.
> >> > > > >
> >> > > > > Consul implementation is based on the ppconsul
> >> > > > > <https://github.com/oliora/ppconsul> library [which we
> >> contributed
> >> > to
> >> > > > > vcpkg
> >> > > > > <
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://github.com/microsoft/vcpkg/pulls?q=is%3Apr+author%3ASamuelMarks+ppconsul
> >> > > > > >],
> >> > > > > which in turn utilizes libcurl <https://curl.haxx.se/libcurl>.
> >> > > > > Unfortunately, ppconsul uses libcurl's easy interface, and
> >> > consequently
> >> > > > it
> >> > > > > is synchronous by design. Again, in the early version of the
> >> library
> >> > we
> >> > > > > used a thread pool to overcome this limitation.
> >> > > > >
> >> > > > > As for etcd, we autogenerated the gRPC C++ client
> >> > > > > <https://github.com/offscale/etcd-client-cpp> [which we
> >> contributed
> >> > to
> >> > > > > vcpkg
> >> > > > > <https://github.com/microsoft/vcpkg/pull/6999>]. gRPC provides
> an
> >> > > > > asynchronous interface, so a "fair" async client can be
> >> implemented
> >> > on
> >> > > > top
> >> > > > > of it.
> >> > > > >
> >> > > > > To sum up, with the chosen toolkit, two of the three
> >> > > > > implementations require a thread pool. After careful
> >> > > > > consideration, we preferred to give the user control over
> >> > > > > threading and opted out of asynchrony.
> >> > > > >
> >> > > > > Nevertheless, there are some options. zookeeper-cpp allows
> >> building
> >> > > with
> >> > > > > custom futures/promises, so we can create a custom build to use
> in
> >> > > > > liboffkv/Mesos. Another variant is to use plain C ZK bindings
> >> > > > > <
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://gitbox.apache.org/repos/asf?p=zookeeper.git;a=tree;f=zookeeper-client/zookeeper-client-c;h=c72b57355c977366edfe11304067ff35f5cf215d;hb=HEAD
> >> > > > > >
> >> > > > > instead of the C++ library.
> >> > > > > As for the Consul client, the only meaningful option is to opt
> >> out of
> >> > > > using
> >> > > > > ppconsul and operate through libcurl's multi interface.
> >> > > > >
> >> > > > > At this point implementing asynchronous interfaces will require
> >> > > rewriting
> >> > > > > liboffkv from the ground up. I can allocate the budget for doing
> >> > this,
> >> > > > as I
> >> > > > > have done to date. However, it would be good to have some more
> >> > > > > back-and-forth before reengaging.
> >> > > > >
> >> > > > > Design Doc:
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://docs.google.com/document/d/1NOfyt7NzpMxxatdFs3f9ixKUS81DHHDVEKBbtVfVi_0
> >> > > > > [feel free to add it to
> >> > > > > http://mesos.apache.org/documentation/latest/design-docs/]
> >> > > > >
> >> > > > > Thanks,
> >> > > > >
> >> > > > > *SAMUEL MARKS*
> >> > > > > Sydney Medical School | Westmead Institute for Medical Research
> |
> >> > > > > https://linkedin.com/in/samuelmarks
> >> > > > > Director | Sydney Scientific Foundation Ltd <
> >> > > > https://sydneyscientific.org>
> >> > > > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io
> >
> >> > > > >
> >> > > > > PS: Damien - not against contributing to FoundationDB, but
> >> priorities
> >> > > are
> >> > > > > Mesos and the Mesos ecosystem, followed by Kubernetes and its
> >> > > ecosystem.
> >> > > > >
> >> > > > > On Tue, Apr 21, 2020 at 3:19 AM Benjamin Mahler <
> >> bmah...@apache.org>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Samuel: One more thing I forgot to mention, we would prefer to
> >> use
> >> > an
> >> > > > > > asynchronous client interface rather than a synchronous one.
> Is
> >> > that
> >> > > > > > something you have considered?
> >> > > > > >
> >> > > > > > On Fri, Apr 17, 2020 at 6:11 PM Vinod Kone <
> >> vinodk...@apache.org>
> >> > > > wrote:
> >> > > > > >
> >> > > > > > > Hi Samuel,
> >> > > > > > >
> >> > > > > > > Thanks for showing interest in contributing to the project.
> >> > Having
> >> > > > > > > optionality between ZooKeeper and Etcd would be great for
> the
> >> > > project
> >> > > > > and
> >> > > > > > > something that has been brought up a few times before, as
> you
> >> > > noted.
> >> > > > > > >
> >> > > > > > > I echo everything that BenM said. As part of the design it
> >> would
> >> > be
> >> > > > > great
> >> > > > > > > to see the migration path for users currently using Mesos
> with
> >> > > > > ZooKeeper
> >> > > > > > to
> >> > > > > > > Etcd. Ideally, the migration can happen without much user
> >> > > > intervention.
> >> > > > > > >
> >> > > > > > > Additionally, from our past experience, efforts like these
> are
> >> > more
> >> > > > > > > successful if the people writing the code have experience
> with
> >> > how
> >> > > > > things
> >> > > > > > > work in Mesos code base. So I would recommend starting
> small,
> >> > maybe
> >> > > > > have
> >> > > > > > a
> >> > > > > > > few engineers work on a couple "newbie" tickets and do some
> >> small
> >> > > > > > projects
> >> > > > > > > and have those committed to the project. That gives the
> >> > committers
> >> > > > some
> >> > > > > > > level of confidence about quality of the code and be more
> >> open to
> >> > > > > bigger
> >> > > > > > > changes like etcd integration. It would also help
> contributors
> >> > get
> >> > > a
> >> > > > > > better
> >> > > > > > > feeling for the lay of the land and see if they are truly
> >> > > interested
> >> > > > in
> >> > > > > > > maintaining this piece of integration for the long haul. This
> >> > > > > > > is a bit of a longer path but I think it would be a more
> >> > > > > > > fruitful one.
> >> > > > > > >
> >> > > > > > > Looking forward to seeing new contributions to Mesos
> including
> >> > the
> >> > > > > above
> >> > > > > > > design!
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > >
> >> > > > > > > On Fri, Apr 17, 2020 at 4:52 PM Samuel Marks <
> >> sam...@offscale.io
> >> > >
> >> > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Happy to build a design doc,
> >> > > > > > > >
> >> > > > > > > > To answer your question on what Offscale.io is, it's my
> >> > software
> >> > > > and
> >> > > > > > > > biomedical engineering consultancy. Currently it's still
> >> rather
> >> > > > > small,
> >> > > > > > > with
> >> > > > > > > > only 8 engineers, but I'm expecting & preparing to grow
> >> > rapidly.
> >> > > > > > > >
> >> > > > > > > > My philosophy is always open-source and patent-free, so
> >> that's
> >> > > what
> >> > > > > my
> >> > > > > > > > consultancy—and for that matter, the charitable research
> >> that I
> >> > > > fund
> >> > > > > > > > through it <https://sydneyscientific.org>—follows.
> >> > > > > > > >
> >> > > > > > > > The goal of everything we create is: interoperable
> >> > > (cross-platform,
> >> > > > > > > > cross-technology, cross-language, multi-cloud);
> open-source
> >> > > > > (Apache-2.0
> >> > > > > > > OR
> >> > > > > > > > MIT); with a view towards scaling:
> >> > > > > > > >
> >> > > > > > > >    - teams;
> >> > > > > > > >    - software-development <https://compilers.com.au>;
> >> > > > > > > >    - infrastructure [this proposed Mesos contribution +
> our
> >> > > DevOps
> >> > > > > > > > tooling];
> >> > > > > > > >    - [in the charity's case] facilitating very large-scale
> >> > > medical
> >> > > > > > > >    diagnostic screening.
> >> > > > > > > >
> >> > > > > > > > Technologies like Mesos we expect to both optimise
> resource
> >> > > > > > > > allocation—reducing costs and increasing data locality—and
> >> > award
> >> > > us
> >> > > > > > > > 'bragging rights' with which we can gain clients that are
> >> > already
> >> > > > > using
> >> > > > > > > > Mesos (which, from my experience, is always big
> corporates…
> >> > > though
> >> > > > > > > > hopefully contributions like these will make it attractive
> >> to
> >> > > small
> >> > > > > > > > companies also).
> >> > > > > > > >
> >> > > > > > > > So no, we're not going anywhere, and are planning to
> >> maintain
> >> > > this
> >> > > > > > > library
> >> > > > > > > > into the future
> >> > > > > > > >
> >> > > > > > > > PS: Once accepted by Mesos, we'll be making similar
> >> > contributions
> >> > > > to
> >> > > > > > > other
> >> > > > > > > > Mesos ecosystem projects like Chronos <
> >> > > > > https://mesos.github.io/chronos
> >> > > > > > >,
> >> > > > > > > > Marathon <https://github.com/mesosphere/marathon>, and
> >> Aurora
> >> > > > > > > > <https://github.com/aurora-scheduler/aurora> as well as
> to
> >> > > > unrelated
> >> > > > > > > > projects (e.g., removing etcd as a hard-dependency from
> >> > > Kubernetes
> >> > > > > > > > <https://kubernetes.io>… enabling them to choose between
> >> > > > ZooKeeper,
> >> > > > > > > etcd,
> >> > > > > > > > and Consul).
> >> > > > > > > >
> >> > > > > > > > Thanks for your continual feedback,
> >> > > > > > > >
> >> > > > > > > > *SAMUEL MARKS*
> >> > > > > > > > Sydney Medical School | Westmead Institute for Medical
> >> > Research |
> >> > > > > > > > https://linkedin.com/in/samuelmarks
> >> > > > > > > > Director | Sydney Scientific Foundation Ltd <
> >> > > > > > > https://sydneyscientific.org>
> >> > > > > > > > | Offscale.io of Sydney Scientific Pty Ltd <
> >> > https://offscale.io>
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Sat, Apr 18, 2020 at 6:58 AM Benjamin Mahler <
> >> > > > bmah...@apache.org>
> >> > > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Oh ok, could you tell us a little more about how you're
> >> using
> >> > > > > Mesos?
> >> > > > > > > And
> >> > > > > > > > > what offscale.io is?
> >> > > > > > > > >
> >> > > > > > > > > Strictly speaking, we don't really need packaging and
> >> > releases
> >> > > as
> >> > > > > we
> >> > > > > > > can
> >> > > > > > > > > bundle the dependency in our repo and that's what we do
> >> for
> >> > > many
> >> > > > of
> >> > > > > > our
> >> > > > > > > > > dependencies.
> >> > > > > > > > > To me, the most important thing is the commitment to
> >> maintain
> >> > > the
> >> > > > > > > library
> >> > > > > > > > > and address issues that come up.
> >> > > > > > > > > I also would lean more towards a run-time flag rather
> >> than a
> >> > > > build
> >> > > > > > > level
> >> > > > > > > > > flag, if possible.
> >> > > > > > > > >
> >> > > > > > > > > I think the best place to start would be to put
> together a
> >> > > design
> >> > > > > > doc.
> >> > > > > > > > The
> >> > > > > > > > > act of writing that will force the author to think
> through
> >> > the
> >> > > > > > details
> >> > > > > > > > (and
> >> > > > > > > > > there are a lot of them!), and we'll then get a chance
> to
> >> > give
> >> > > > > > > feedback.
> >> > > > > > > > > You can look through the mailing list for past examples
> of
> >> > > design
> >> > > > > > docs
> >> > > > > > > > (in
> >> > > > > > > > > terms of which sections to include, etc).
> >> > > > > > > > >
> >> > > > > > > > > How does that sound?
> >> > > > > > > > >
> >> > > > > > > > > On Tue, Apr 14, 2020 at 8:44 PM Samuel Marks <
> >> > > sam...@offscale.io
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Dear Benjamin Mahler [and *Developers mailing-list for
> >> > Apache
> >> > > > > > > Mesos*],
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks for responding so quickly.
> >> > > > > > > > > >
> >> > > > > > > > > > Actually this entire project I invested—time & money,
> >> > > > including a
> >> > > > > > > > > > development team—explicitly in order to contribute
> this
> >> to
> >> > > > Apache
> >> > > > > > > > Mesos.
> >> > > > > > > > > So
> >> > > > > > > > > > no releases yet, because I wanted to ensure it was up
> to
> >> > the
> >> > > > > > > > > specification
> >> > > > > > > > > > requirements referenced in dev@mesos.apache.org
> before
> >> > > > > proceeding
> >> > > > > > > with
> >> > > > > > > > > > packaging and releases.
> >> > > > > > > > > >
> >> > > > > > > > > > Tests have been setup in Travis CI for Linux (Ubuntu
> >> 18.04)
> >> > > and
> >> > > > > > > macOS,
> >> > > > > > > > > > happy to set them up elsewhere also. There are also
> some
> >> > > > Windows
> >> > > > > > > builds
> >> > > > > > > > > > that need a bit of tweaking, then they will be pushed
> >> into
> >> > CI
> >> > > > > also.
> >> > > > > > > We
> >> > > > > > > > > are
> >> > > > > > > > > > just starting to do some work on reducing build & test
> >> > times.
> >> > > > > > > > > >
> >> > > > > > > > > > Would be great to build a checklist of things you want
> >> to
> >> > see
> >> > > > > > before
> >> > > > > > > we
> >> > > > > > > > > > send the PR, e.g.,
> >> > > > > > > > > >
> >> > > > > > > > > >    - ☐ hosted docs;
> >> > > > > > > > > >    - ☐ CI/CD (including packaging) for Windows, Linux, and macOS;
> >> > > > > > > > > >    - ☐ releases on GitHub;
> >> > > > > > > > > >    - ☐ consistent session and auth interface;
> >> > > > > > > > > >    - ☐ different tests [can you expand here?]
> >> > > > > > > > > >
> >> > > > > > > > > > This is just an example checklist; it would be best if
> >> > > > > > > > > > you and others could flesh it out, so that when we do
> >> > > > > > > > > > send the PR it's in an immediately mergeable state.
> >> > > > > > > > > >
> >> > > > > > > > > > BTW: I originally had a debate with my team about
> >> > > > > > > > > > whether to send a PR out of the blue (like Microsoft
> >> > > > > > > > > > famously did for Node.js
> >> > > > > > > > > > <https://github.com/nodejs/node/pull/4765>) or start
> >> > > > > > > > > > an *offer thread* on the developers mailing-list.
> >> > > > > > > > > >
> >> > > > > > > > > > Looking forward to contributing 🦀
> >> > > > > > > > > >
> >> > > > > > > > > > *SAMUEL MARKS*
> >> > > > > > > > > > Sydney Medical School | Westmead Institute for Medical Research |
> >> > > > > > > > > > https://linkedin.com/in/samuelmarks
> >> > > > > > > > > > Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org>
> >> > > > > > > > > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > On Wed, Apr 15, 2020 at 2:38 AM Benjamin Mahler
> >> > > > > > > > > > <bmah...@apache.org> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Thanks for reaching out. A well-maintained and
> >> > > > > > > > > > > well-written wrapper interface to the three backends
> >> > > > > > > > > > > would certainly make this easier for us vs
> >> > > > > > > > > > > implementing such an interface ourselves.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Is this the client interface?
> >> > > > > > > > > > >
> >> > > > > > > > > > > https://github.com/offscale/liboffkv/blob/d31181a1e74c5faa0b7f5d7001879640b4d9f111/liboffkv/client.hpp#L115-L142
> >> > > > > > > > > > >
> >> > > > > > > > > > > At a quick glance, three ZK things that we rely on
> >> > > > > > > > > > > but that seem to be absent from the common interface
> >> > > > > > > > > > > are the ZK session, authentication, and
> >> > > > > > > > > > > authorization. How will these be provided via the
> >> > > > > > > > > > > common interface?
> >> > > > > > > > > > >
> >> > > > > > > > > > > Here is our ZK interface wrapper if you want to see
> >> > > > > > > > > > > what kinds of things we use:
> >> > > > > > > > > > >
> >> > > > > > > > > > > https://github.com/apache/mesos/blob/1.9.0/include/mesos/zookeeper/zookeeper.hpp#L72-L339
> >> > > > > > > > > > >
> >> > > > > > > > > > > The project has 0 releases and 0 issues; what kind
> >> > > > > > > > > > > of usage has it seen? Has there been any testing
> >> > > > > > > > > > > yet? Would Offscale.io be doing some of the testing?
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Mon, Apr 13, 2020 at 7:54 PM Samuel Marks
> >> > > > > > > > > > > <sam...@offscale.io> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Apache ZooKeeper <https://zookeeper.apache.org> is
> >> > > > > > > > > > > > a large dependency. Enabling developers and
> >> > > > > > > > > > > > operations to use etcd <https://etcd.io>, Consul
> >> > > > > > > > > > > > <https://consul.io>, or ZooKeeper should reduce
> >> > > > > > > > > > > > resource utilisation and enable new use cases.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > There have already been a number of suggestions to
> >> > > > > > > > > > > > get rid of the hard dependency on ZooKeeper. For
> >> > > > > > > > > > > > example, see: MESOS-1806
> >> > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-1806>,
> >> > > > > > > > > > > > MESOS-3574
> >> > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-3574>,
> >> > > > > > > > > > > > MESOS-3797
> >> > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-3797>,
> >> > > > > > > > > > > > MESOS-5828
> >> > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-5828>,
> >> > > > > > > > > > > > MESOS-5829
> >> > > > > > > > > > > > <https://issues.apache.org/jira/browse/MESOS-5829>.
> >> > > > > > > > > > > > However, there are difficulties in supporting a few
> >> > > > > > > > > > > > implementations for different services with quite
> >> > > > > > > > > > > > distinct data models.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > A few months ago offscale.io invested in a solution
> >> > > > > > > > > > > > to this problem: liboffkv
> >> > > > > > > > > > > > <https://github.com/offscale/liboffkv>, a *C++*
> >> > > > > > > > > > > > library which provides a *uniform interface over
> >> > > > > > > > > > > > ZooKeeper, Consul KV and etcd*. It abstracts common
> >> > > > > > > > > > > > features of these services into its own data model,
> >> > > > > > > > > > > > which is very similar to ZooKeeper’s. Careful
> >> > > > > > > > > > > > attention was paid to keep methods both efficient
> >> > > > > > > > > > > > and consistent. It is cross-platform, open-source
> >> > > > > > > > > > > > (*Apache-2.0 OR MIT*), and is written in C++, with
> >> > > > > > > > > > > > vcpkg packaging, *C library output
> >> > > > > > > > > > > > <https://github.com/offscale/liboffkv/blob/d3d549e/CMakeLists.txt#L29-L35>*,
> >> > > > > > > > > > > > and additional interfaces in *Go
> >> > > > > > > > > > > > <https://github.com/offscale?q=goffkv>*, *Java
> >> > > > > > > > > > > > <https://github.com/offscale/liboffkv-java>*, and
> >> > > > > > > > > > > > *Rust <https://github.com/offscale/rsoffkv>*.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Offscale.io proposes to replace all ZooKeeper usages
> >> > > > > > > > > > > > in Mesos with usages of liboffkv. Since all
> >> > > > > > > > > > > > interactions which require ZooKeeper in Mesos are
> >> > > > > > > > > > > > conducted through the class Group (and GroupProcess)
> >> > > > > > > > > > > > with a clear interface, the obvious way to introduce
> >> > > > > > > > > > > > changes is to provide another implementation of the
> >> > > > > > > > > > > > class which uses liboffkv instead of ZooKeeper. In
> >> > > > > > > > > > > > this case the original implementation may be left
> >> > > > > > > > > > > > unchanged in the codebase, and build flags to select
> >> > > > > > > > > > > > between the ZK-only and liboffkv variants may be
> >> > > > > > > > > > > > introduced. Once the community is confident, you can
> >> > > > > > > > > > > > decide to remove the ZK-only option, and instead
> >> > > > > > > > > > > > only support liboffkv [which internally has build
> >> > > > > > > > > > > > flags for each service].
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Removing the hard dependency on ZooKeeper will
> >> > > > > > > > > > > > simplify local deployment for testing purposes as
> >> > > > > > > > > > > > well as enable using Mesos in clusters without
> >> > > > > > > > > > > > ZooKeeper, e.g. where etcd or Consul is used for
> >> > > > > > > > > > > > coordination. We expect this to greatly reduce
> >> > > > > > > > > > > > resource usage (network, CPU, disk, memory) in a
> >> > > > > > > > > > > > datacenter environment.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > If the community accepts the initiative, we will
> >> > > > > > > > > > > > integrate liboffkv into Mesos. We are also ready to
> >> > > > > > > > > > > > develop the library and consider any suggested
> >> > > > > > > > > > > > improvements.
> >> > > > > > > > > > > > *SAMUEL MARKS*
> >> > > > > > > > > > > > Sydney Medical School | Westmead Institute for Medical Research |
> >> > > > > > > > > > > > https://linkedin.com/in/samuelmarks
> >> > > > > > > > > > > > Director | Sydney Scientific Foundation Ltd <https://sydneyscientific.org>
> >> > > > > > > > > > > > | Offscale.io of Sydney Scientific Pty Ltd <https://offscale.io>
> >> > > > > > > > > > > > *SYDNEY SCIENTIFIC FOUNDATION and THE UNIVERSITY OF SYDNEY*
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > PS: We will be offering similar contributions to
> >> > > > > > > > > > > > Chronos <https://mesos.github.io/chronos>, Marathon
> >> > > > > > > > > > > > <https://github.com/mesosphere/marathon>, Aurora
> >> > > > > > > > > > > > <https://github.com/aurora-scheduler/aurora>, and
> >> > > > > > > > > > > > related projects.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
