Hi,

Patrick's suggestion seems good to me.

I won't go into specifics here as I genuinely need to prepare for
this. It is quite hard to dig deep into the solutions of others and
offer constructive criticism, because it takes a lot of time to
study them and everybody has their own "whys" behind their choices.

To summarize my goals and concerns:

1) We should be as "Kubernetes operator idiomatic" as possible:
industry standards, no custom brainchild of this or that group
adopted because they thought it was cool or because they did not know
any better. I am NOT saying it is like that right now; I just want us
to be as ruthless as possible when it comes to functionality and why
it is done the way it is. It is great that we already have something
up to date (thanks to John) that adheres to the latest releases. I
personally had a hard time keeping up with all the releases: once I
finished something and aligned it, a week or two later there was
already another release where things were different. It is a very
fast-moving space, and I hope that by the time we develop something it
will not be obsolete.

2) It may be easier said than done, but people are guaranteed to
get emotional (it's their "precious", etc.), so please let's go into
this with good intentions, not trying to push one solution over
another just because we would like to see it there. I will have an
equally hard time complying with this point myself. My plan is to
explain what is _wrong_ with our solution: where we made mistakes and
what should have been done differently but is now "too late" to
change. It is quite hard to describe your own work and effort in this
light, but without saying what is wrong we cannot decide what is good,
imho.

3) We should put something together fast enough that we can call it a
release. We can always iterate on it for eternity, but the foundations
need to be there. Here I want to say that I especially like what John
did. I looked through his specs and it was obvious they had been
written with care and attention. They looked _solid_. I am not sure how
hard it would be to put everything else on top of that, I truly am
not, and I think we would have to reinvent the wheel if we want to
proceed, because I cannot imagine what it would take to retrofit e.g.
CassKop on top of John's specs. It is like putting a square peg into
a round hole: maybe some chunks could be reused easily, but
otherwise I worry we would be back at square one.

One specific feeling I have as I read this is that even if there is
the will to create a fourth operator, the respective parties will
not be able to drop their own repositories. The whole point behind this
effort, to me, is to have a solid, community-driven, stable, modern
and feature-complete operator that people are truly using. I can see
that once this is real, we will _really_ sunset our operator,
redirecting people to the new operator in the main README etc.; we
truly mean it. Sure, if somebody comes along and a bug fix is needed,
we will fix it, but the whole point of doing this is to stop using
what we currently have, over time; otherwise we are just splitting
this space even more. If CassKop is not sure whether they will use it
because they do not know if that operator will be "enough" for them,
aren't we just doing it wrong? To exaggerate, they should be fine
with deleting their whole repository and using just this Cassandra
one we are going to make; otherwise I don't see the point of working
on this ...

On Thu, 24 Sep 2020 at 20:45, Joshua McKenzie <jmcken...@apache.org> wrote:
>
> - choose cass-operator: it is not on offer right now so let’s see if it does
>
>
> We should all talk a lot more, but this is 100% a mistake - I take the
> blame for that. The intention has long been to offer cass-operator for
> donation but it slipped through the cracks and your email yesterday made me
> double-take.
>
> We have since resolved this misalignment. DataStax would be happy to donate
> any and all of cass-operator to the ASF and C* project if it's what we all
> agree best serves our collective Cassandra users. I'm also cognizant that
> an immense amount of effort has gone into CassKop and we seem to have
> something of an embarrassment of riches.
>
> I'm given to understand (haven't dug in personally) that the two operators
> express pretty different opinions when it comes to frameworks, designs,
> supported versions, etc. I think a discrete enumeration of the feature set
> and "identities" of both could really help navigate this conversation going
> forward.
>
> Also - thanks for that context Franck. It's always helpful to know where
> other people are coming from when we're all working together towards a
> common goal.
>
>
> On Thu, Sep 24, 2020 at 12:23 PM, <franck.de...@orange.com> wrote:
>
> > I can share Orange’s view of the situation, sorry it is a long story!
> >
> > We started CassKop at the end of 2018 after betting on K8S which was not
> > so simple as far as C* was concerned. Lack of support for local storage,
> > IPs that change all the time, different network plugins to try to implement
> > a non standard K8s way of having nodes see each other from different dcs…
> > We hesitated with Mesos but could not have both and K8S was already
> > gaining so much traction that you could not not choose it.
> >
> > Anyway, we looked around and did not see anyone with such requirements so
> > we said: why not try it ourselves but on github so that we may give it back
> > to the community. We have used C* for quite a few years with great success
> > on production with massive load and perfect availability. We love C* @
> > Orange :) Thanks!
> >
> > So we started writing support for mono-dc cluster (CassKop) and added the
> > multi dc support with MultiCassKop which is another operator included in
> > the CassKop repo. For more details we tried to document our designs as much
> > as possible here:
> > https://orange-opensource.github.io/casskop/docs/1_concepts/3_design_principes#multi-site-management
> >
> > In the middle of last year we had some talks with Datastax about working
> > together around their new management sidecar. Their position on open source
> > was not clear at that time so we said please come back when you have
> > decided to go open source with it. Which they did in the beginning of this
> > year. But at that time I guess work had started on cass-operator so we kept
> > our separate ways.
> >
> > Since the beginning of the year, we have been working with our OPS team
> > to have it in production. It is not simple as the team has to learn K8S and
> > trust a newborn operator. This takes time especially as our internal
> > cluster has been tweaked for multi-tenancy with obscure options being set
> > by our K8s team…
> >
> > We also developed with Instaclustr the Backup & Restore functionality (we
> > have new CRDs (Custom Resource Definition) for backup and restore and a
> > reconcile loop that calls out Instaclustr sidecar for these operations). We
> > now support multiple backups in parallel and can write to S3, Google or
> > Azure (but Stefan could give more details here if needed)
> >
> > During the SIG calls we mentioned our desire to donate CassKop once it
> > satisfies our basic requirements (v1 coming just now but I said it too
> > many times already) I am actually not sure Datastax mentioned their desire
> > to donate cass-operator but we decided to compare the designs and the
> > functionalities based on respective CRDs. The CRD is the interface with the
> > user as it is where you describe the cluster that you want to have. These
> > talks were very interesting and we found out that the CassKop team had made
> > good choices most of the time but was maybe too open. Indeed our intention
> > was to give all the possibilities for our OPS team to work. This includes :
> > - very open topology definition using any configuration of labels to map
> > dcs / racks and nodes to labels on clusters (we have labels on dcs / rooms
> > / rows and server racks so we can map C* racks to storage or network arrays
> > internally)
> > - possibility to have multiple C* nodes on a single K8S host (because
> > internal clouds are not really clouds, they have limited resources)
> > - custom C* image selection,
> > - custom bootstrap script that lets you configure C* as you want using
> > ConfigMaps,
> > - the ability to mount different volumes wherever they wanted,
> > - the possibility to run any number of sidecars alongside C* for custom
> > probes in our case
> >
> > This makes CassKop quite powerful and flexible.
> > We made sure that all those options are not enabled by default so one can
> > just pop a simple 3 node cluster quickly
> >
> > On the other hand cass-operator had an interesting way of configuring C*
> > just inside the CRD using cass-config. This is simple and elegant so we are
> > implementing it as well for the support of C* 4
> >
> > Now for the future, there are 3 choices in my opinion:
> > - start from scratch (or John’s repo) by cherry picking bits from all
> > operators. This is possible but will take some time / effort to have
> > something usable. And then it will be compared to cass-operator and
> > CassKop. I don’t see Orange contributing too much here as we believe
> > CassKop to be a much better starting point
> > - choose cass-operator: it is not on offer right now so let’s see if it
> > does. I think Orange could contribute some bits inherited from CassKop if
> > it is agreed by the community. Not sure it would be enough for us to use
> > it.
> > - choose CassKop: we would be delighted to donate it and contribute with
> > some committers (including the original author who now works for AWS). It
> > would then become the community operator but there would be cass-operator
> > alongside probably. But Cass-operator is made to make it easier for
> > Datastax to manage customer clusters by imposing some configuration. It
> > makes sense for their needs, so maybe 2 operators. We don’t know how
> > backup/restore will be handled here with medusa being adapted to K8s
> >
> > Sorry again for being long but 2 years of work deserve some lines of text
> > :)
> >
> > I just saw your message Patrick but this was written already so we gain a
> > week.
> >
> > Franck
> >
> > On 24 Sep 2020, at 10:08, Benjamin Lerer <benjamin.le...@datastax.com
> > <mailto:benjamin.le...@datastax.com>> wrote:
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It doesn't
> > seem like it can hurt at this point, anyway.
> >
> > +1
> >
> > On Wed, Sep 23, 2020 at 9:21 PM Benedict Elliott Smith <benedict@apache.
> > org<mailto:bened...@apache.org>> wrote:
> >
> > Perhaps it helps to widen the field of discussion to the dev list?
> >
> > It might help if each of the stakeholder organisations state their view on
> > the situation, including why they would or would not support a given
> > approach/operator, and what (preferably specific) circumstances might lead
> > them to change their mind?
> >
> > I realise there are meeting logs, but getting a wider discourse with
> > non-stakeholder input might help to build a community consensus? It doesn't
> > seem like it can hurt at this point, anyway.
> >
> > On 23/09/2020, 17:13, "John Sanda" <john.sa...@gmail.com<mailto:john.
> > sa...@gmail.com>> wrote:
> >
> > I want to point out that pretty much everything being discussed in this
> > thread has been discussed at length during the SIG meetings. I think it is
> > worth noting because we are pretty much still having the same conversation.
> >
> > On Wed, Sep 23, 2020 at 12:03 PM Benedict Elliott Smith < benedict@apache.
> > org<mailto:bened...@apache.org>> wrote:
> >
> > I don't think there's anything about a code drop that's not "The Apache
> > Way"
> >
> > If there's a consensus (or even strong majority) amongst invested parties,
> > I don't see why we could not adopt an operator directly into the project.
> >
> > It's possible a green field approach might lead to fewer hard feelings, as
> > everyone is in the same boat. Perhaps all operators are also suboptimal and
> > could be improved with a rewrite? But I think coordinating a lot of
> > different entities around an empty codebase is particularly challenging. I
> > actually think it could be better for cohesion and collaboration to have a
> > suboptimal but substantive starting point.
> >
> > On 23/09/2020, 16:11, "Stefan Miklosovic" < stefan.miklosovic@
> > instaclustr.com<mailto:stefan.mikloso...@instaclustr.com>> wrote:
> >
> > I think that from Instaclustr it was stated quite clearly multiple
> > times that we are "fine to throw it away" if there is something better
> > and more widespread. Indeed, we have invested a lot of time in the
> > operator but it was not useless at all, we gained a lot of quite unique
> > knowledge how to put all pieces together. However, I think that
> > this space is going to be quite fragmented and "balkanized", which is
> > not always a bad thing, but in a quite narrow area as Kubernetes operator
> > is, I just do not see how 4 operators are going to be beneficial for
> > ordinary people ("official" from community, ours, Datastax one and CassKop
> > (without any significant order)). Sure, innovation and healthy competition
> > is important but to what extent ...
> > One can start a Cassandra cluster on Kubernetes just so many times
> > differently and nobody really likes a vendor lock-in. People wanting
> > to run a cluster on K8S realise that there are three operators, each
> > backed by a private business entity, and the community operator is not
> > there ... Huh, interesting ... One may even start to question what is
> > wrong with these folks that it takes three companies to build their
> > own solution.
> >
> > Having said that, to my perception, Cassandra community just does not
> > have enough engineers nor contributors to keep 4 operators alive at
> > the same time (I wish I was wrong) so the idea of selecting the best
> > one or to merge obvious things and approaches together is understandable,
> > even if it meant we eventually sunset ours. In addition, nobody from big
> > players is going to contribute to the code
> > base of the other one, for obvious reasons, so channeling and directing
> > this effort into something common for a community seems to
> > be the only reasonable way of cooperation.
> >
> > It is quite hard to bootstrap this if the donation of the code in big
> > chunks / whole repo is out of question as it is not the "Apache way"
> > (there was some thread running here about this in more depth a while
> > ago) and we basically need to start from scratch which is quite
> > demotivating, we are just inventing the wheel and nobody is up to it.
> > It is like people are waiting for that to happen so they can jump in
> > "once it is the thing" but it will never materialise or at least the
> > hurdle to kick it off is unnecessarily high. Nobody is going to invest
> > in this heavily if there is already a working operator from companies
> > mentioned above. As I understood it, one reason of not choosing the
> > way of donating it all is that "the learning and community building
> > should happen in organic manner and we just can not accept the donation",
> > but isn't it true that it is easier to build a community
> > around something which is already there rather than trying to build it
> > around an idea which is quite hard to dedicate to?
> >
> > On Wed, 23 Sep 2020 at 15:28, Joshua McKenzie < jmcken...@apache.org
> > <mailto:jmcken...@apache.org>> wrote:
> >
> > I think there's significant value to the community in trying to coalesce
> > on a single approach,
> >
> > I agree. Unfortunately in this case, the parties with a vested interest and
> > written operators came to the table and couldn't agree to coalesce on a
> > single approach. John Sanda attempted to start an initiative to write a
> > best-of-breed combining choice parts of each operator, but that effort did
> > not gain traction.
> >
> > Which is where my hypothesis comes from that if there were a clear "better
> > fit" operator to start from we wouldn't be in a deadlock; the correct
> > choice would be obvious. Reasonably so, every engineer that's written
> > something is going to want that something to be used and not thrown away in
> > favor of another something without strong evidence as to why that's the
> > better choice.
> >
> > As far as I know, nobody has made a clear case as to a more compelling
> > place to start in terms of an operator donation the project then
> > collaborates on. There's no mass adoption evidence nor feature enumeration
> > that I know of for any of the approaches anyone's taken, so the discussions
> > remain stalled.
> >
> > On Wed, Sep 23, 2020 at 7:18 AM, Benedict Elliott Smith < benedict@apache.
> > org<mailto:bened...@apache.org> wrote:
> >
> > I think there's significant value to the community in trying to coalesce
> > on a single approach, earlier than later. This is an opportunity to expand
> > the number of active organisations involved directly in the Apache
> > Cassandra project, as well as to more quickly expand the project's
> > functionality into an area we consider urgent and important. I think it
> > would be a real shame to waste this opportunity. No doubt it will be hard,
> > as organisations have certain built-in investments in their own approaches.
> >
> > I haven't participated in these calls as I do not consider myself to have
> > the relevant experience and expertise, and have other focuses on the
> > project. I just wanted to voice a vote in favour of trying to bring the
> > different organisations together on a single approach if possible. Is there
> > anything the project can do to help this happen?
> >
> > On 23/09/2020, 03:04, "Ben Bromhead" <b...@instaclustr.com<mailto:ben@
> > instaclustr.com>> wrote:
> >
> > I think there is certainly an appetite to donate and standardise on a
> > given operator (as mentioned in this thread).
> >
> > I personally found the SIG hard to participate in due to time zones and
> > the synchronous nature of it.
> >
> > So while it was a great forum to dive into certain details for a subset of
> > participants and a worthwhile endeavour, I wouldn't paint it as an accurate
> > reflection of community intent.
> >
> > I don't think that any participants want to continue down the path of "let
> > a thousand flowers bloom". That's why we are looking towards CassKop (as
> > well as a number of technical reasons).
> >
> > Some of the recorded meetings and outputs can also be found if you are
> > interested in some primary sources
> > https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Kubernetes+Operator+SIG
> >
> > From what I understand second-hand from talking to people on the SIG
> > calls, there was a general inability to agree on an existing operator as a
> > starting point and not much engagement on taking best of breed from the
> > various to combine them. Seems to leave us in the "let a thousand flowers
> > bloom" stage of letting operators grow in the ecosystem and seeing which
> > ones meet the needs of end users before talking about adopting one into the
> > foundation.
> >
> > Great to hear that you folks are joining forces though! Bodes well for C*
> > users that are wanting to run things on k8s.
> >
> > On Tue, Sep 22, 2020 at 4:26 AM, Ben Bromhead < b...@instaclustr.com
> > <mailto:b...@instaclustr.com>
> >
> > wrote:
> >
> > For what it's worth, a quick update from me:
> >
> > CassKop now has at least two organisations working on it substantially
> > (Orange and Instaclustr) as well as the numerous other contributors.
> >
> > Internally we will also start pointing others towards CassKop once a few
> > things get merged. While we are not sunsetting our operator yet, it is
> > certainly looking that way.
> >
> > I'd love to see the community adopt it as a starting point for working
> > towards whatever level of functionality is desired.
> >
> > Cheers
> >
> > Ben
> >
> > On Fri, Sep 11, 2020 at 2:37 PM John Sanda <
> > john.sa...@gmail.com>
> > wrote:
> >
> > On Thu, Sep 10, 2020 at 5:27 PM Josh McKenzie < jmcken...@apache.org>
> > wrote:
> >
> > There's basically 1 java driver in the C* ecosystem. We have 3? 4? or more
> > operators in the ecosystem. Has one of them hit a clear supermajority of
> > adoption that makes it the de facto default and makes sense to pull it into
> > the project?
> >
> > We as a project community were pretty slow to move on building a PoV around
> > kubernetes so we find ourselves in a situation with a bunch of contenders
> > for inclusion in the project. It's not clear to me what heuristics we'd use
> > to gauge which one would be the best fit for inclusion outside letting
> > community adoption speak.
> >
> > ---
> > Josh McKenzie
> >
> > We actually talked a good bit on the SIG call earlier today about
> > heuristics. We need to document what functionality an operator should
> > include at level 0, level 1, etc. We did discuss this a good bit during
> > some of the initial SIG meetings, but I guess it wasn't really a focal
> > point at the time. I think we should also provide references to existing
> > operator projects and possibly other related projects. This would benefit
> > both community users as well as people working on these projects.
> >
> > - John
> >
> > --
> >
> > Ben Bromhead
> >
> > Instaclustr | www.instaclustr.com | @instaclustr
> > <http://twitter.com/instaclustr> | (650) 284 9692
> >
> > --------------------------------------------------------------------- To
> > unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
> > additional
> > commands, e-mail: dev-h...@cassandra.apache.org
> >
> > --
> >
> > - John
> >
> > _________________________________________________________________________________________________________________________
> >
> >
> > This message and its attachments may contain confidential or privileged
> > information that may be protected by law; they should not be distributed,
> > used or copied without authorisation. If you have received this email in
> > error, please notify the sender and delete this message and its
> > attachments. As emails may be altered, Orange is not liable for messages
> > that have been modified, changed or falsified. Thank you.
> >

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
