> Why don't we just say "any new contributions go to cassandra-ecosystem, deal
> with it"
"You can catch more flies with honey than with vinegar"
For a very small engineering investment, we can decouple ongoing contributions
from the time horizon for getting cassandra-ecosystem up and running, CI
passing, and releases being cut.
The more we minimize the window where people cannot contribute, the more
contributions we get. We've repeatedly learned the lesson over the years that
making changes that hard-block other people's workflows in the community is a
Bad Idea and should be our last resort.
My #1 priority through this is making it as seamless as possible for existing
contributors; I strongly believe that's part of how you get buy-in.
> If the majority of devs contributing to old repos just does not mind there is
> the ecosystem repo somewhere then it is probable all work related to the
> ecosystem being ready to be shipped will be on a few people (probably you)
> which is not ideal.
For cutover, our dependencies will be 3 things that already have extensive
prior art that we can basically copy/paste with minor tweaks to get them
working in their new home:
1. The build(s). De-duping gradle targets and updating documentation on what
to run to do what plus task grouping. Maybe we add a new combined build target
that'll build both for work straddling both projects if we're feeling feisty.
2. Getting CI working. Figuring out how to get PRs to run pipeline A
(sidecar) or B (analytics), or both, based on what changes. We could even
leave this one manual out of the gate (user needs to go to GitHub Actions / CI
and trigger which suite they want) if it proves in any way non-trivial.
Probably won't though.
3. Getting release cuts working. Should be the smallest of the 3.
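
As a rough illustration of the path-based routing in item 2, here is a minimal
Python sketch. The top-level folder names and the pipeline labels are
assumptions made for the example, not something settled in this thread, and
the rule "anything outside both folders triggers both suites" is just one
conservative choice.

```python
from typing import Iterable, Set

# Hypothetical top-level folders in cassandra-ecosystem; the real layout may differ.
PIPELINE_ROOTS = {
    "sidecar": "cassandra-sidecar/",
    "analytics": "cassandra-analytics/",
}


def pipelines_for(changed_paths: Iterable[str]) -> Set[str]:
    """Return the set of CI pipelines a PR should run, given its changed paths."""
    selected: Set[str] = set()
    for path in changed_paths:
        matched = [name for name, root in PIPELINE_ROOTS.items()
                   if path.startswith(root)]
        if not matched:
            # A shared file (root build script, CI config, docs) could affect
            # either project, so fall back to running every pipeline.
            return set(PIPELINE_ROOTS)
        selected.update(matched)
    return selected
```

A PR touching only files under the sidecar folder would select just the
sidecar pipeline; a PR touching a root-level build file would select both.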
Since we have fully working examples of the above 3 paradigms and aren't making
underlying code structure changes before the cutover, we have 3 very modest
change sets with extensive prior art plus a trivially automatable "copy in,
commit with prior message" script.
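
One possible shape for that script, sketched in Python around two stock git
commands: `git format-patch` to export an upstream commit with its author and
message, and `git am --directory=<dir>` to replay it under a new root folder.
This swaps the literal "copy files in" step for a patch replay, which has the
same effect when the trees are otherwise identical. The repo paths, the
subdirectory name, and the helper names are hypothetical, and merge commits
would need separate handling.

```python
import subprocess
from typing import List


def export_cmd(sha: str) -> List[str]:
    # Emit exactly one commit as an mbox-style patch on stdout,
    # preserving the original author and commit message.
    return ["git", "format-patch", "-1", sha, "--stdout"]


def import_cmd(subdir: str) -> List[str]:
    # --directory prepends subdir to every path in the patch, which covers
    # the "same files, different root folder" case.
    return ["git", "am", f"--directory={subdir}"]


def port_commit(upstream_repo: str, ecosystem_repo: str,
                sha: str, subdir: str) -> None:
    """Replay one upstream commit into the ecosystem repo under subdir."""
    patch = subprocess.run(export_cmd(sha), cwd=upstream_repo,
                           check=True, capture_output=True).stdout
    # git am reads the mailbox from stdin when no file is given.
    subprocess.run(import_cmd(subdir), cwd=ecosystem_repo,
                   input=patch, check=True)
```

Usage would look something like
`port_commit("../cassandra-sidecar", ".", "<sha>", "cassandra-sidecar")`,
run once per merged upstream commit in order.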
I'm all for not having bus factor 1 but my instinct about how long getting to
cutover will take is on the order of days of work, not even weeks. And mostly
LLM-assisted, since gradle / build / CI work is very compatible with their
strengths.
On Wed, Jul 1, 2026, at 5:34 AM, Štefan Miklošovič wrote:
> Why do we actually try to catch up with what is happening in
> cassandra-{analytics,sidecar} at all? What I wanted to say is that
> once this CEP is voted on and passed, we can go to these repos and
> deprecate them right away. There does not need to be any interim
> period where the ecosystem is kind of there but not ready and this
> non-readiness might take a lot of time so we would be applying stuff
> there until it is ready to launch ... duh.
>
> Why don't we just say "any new contributions go to
> cassandra-ecosystem, deal with it" so people will naturally move
> towards that repo and make it faster to be ready because it will be in
> their interest to have it released instead of us / you trying to keep
> it aligned all the time? If the majority of devs contributing to old
> repos just does not mind there is the ecosystem repo somewhere then it
> is probable all work related to the ecosystem being ready to be
> shipped will be on a few people (probably you) which is not ideal.
>
> On Tue, Jun 30, 2026 at 1:52 PM Josh McKenzie <[email protected]> wrote:
> >
> > Do you still want to place it each under cassandra-analytics /
> > cassandra-sidecar or under cassandra-ecosystem?
> >
> > I view that downloads page as client-facing, so "prioritizing what makes
> > the most intuitive sense to an end-user" is where I default. In this
> > instance, repo structure is a black-box implementation detail users don't
> > need to know anything about, so I'd be inclined to keep them as they are
> > today. If we went with "cassandra-ecosystem" there, then that immediately
> > begs the question of why the client drivers aren't considered part of the
> > ecosystem, etc.
> >
> > I could envision a future in which we had a cassandra-ecosystem structure
> > w/all the ecosystem projects in it, but that'd be down the road if the root
> > level namespace got crowded enough to become unwieldy. Since we EoL our
> > older release lines and are bound to 3 GA C* releases, I doubt we'll reach
> > that inflection point unless our ecosystem meaningfully expands with new
> > artifacts.
> >
> > Next, what is going to be the last release of "old" analytics and sidecar?
> >
> > Hm. Good question. Honestly, if we produce a binary from the new repo and
> > it's byte-identical to the old one, I don't know that it actually matters?
> > Maybe it does and there's something I'm missing - seems like a "six of one,
> > half a dozen of another" type situation.
> >
> > It will be a little bit tricky to update the ecosystem until we are ready
> > for its release...
> >
> > I thought the same initially but the more I think about it, the more I
> > think it should be straightforward and trivially automatable. In the first
> > phase (before we cut over to using the cassandra-ecosystem repo and freeze
> > the old ones), the repo structure and all .java files should be identical
> > in cassandra-ecosystem to upstream cassandra-* excepting the change in root
> > folder path. Integrating PR's from those other repos should be as easy as
> > taking a diff, applying it directly to the new repo (albeit in a slightly
> > different path), and committing with the same message.
> >
> > So long as we prohibit any new changes in cassandra-ecosystem (i.e. all
> > non-structural, non-new CI, non-new release commits), it should retain that
> > property of upstream changes being trivially mergeable via automation. My
> > mental test of whether this works or not: we could literally, for each PR,
> > just copy and paste all changed files into the new repo and commit using
> > the message and --author of the old and be done with it.
> >
> > Keeping this decoupled and straightforward will definitely require
> > discipline and being clear on what "Phase" we're in; since moving things
> > around and munging with CI and release flows is traditionally kind of a
> > one-person-job (plus review), I'm hoping we can keep things cleanly
> > separated and staged and just grind through it.
> >
> > What do you think? Anything in my reasoning above missing anything?
> > ~Josh
> >
> > On Tue, Jun 30, 2026, at 5:23 AM, Štefan Miklošovič wrote:
> >
> > Hi Josh,
> >
> > just went through the drafted CEP of yours.
> >
> > Looks good, I am curious what we plan to do with this
> >
> > https://downloads.apache.org/cassandra/
> >
> > Do you still want to place it each under cassandra-analytics /
> > cassandra-sidecar or under cassandra-ecosystem? I think the latter
> > (dedicated cassandra-ecosystem) makes more sense, right? We might keep
> > cassandra-{analytics/sidecar} for the record there (actually, we
> > probably have to), but since we are merging into a new repository then
> > it should be released under cassandra-ecosystem dir, no?
> >
> > Next, what is going to be the last release of "old" analytics and
> > sidecar? We will do one more 0.5.0 of each the old way or the next one
> > is going to be cassandra-ecosystem? I don't mind either way. I think
> > this is something we need to agree on and will be influenced by the
> > adoption of this CEP.
> >
> > It will be a little bit tricky to update the ecosystem until we are
> > ready for its release. To offload anybody doing this, maybe it would
> > be better to instruct devs to just start to contribute to the
> > ecosystem once the ecosystem repo is created and in a buildable state.
> > We can freeze the old repos way before the ecosystem is technically
> > ready to be released, it will at least make the devs motivated to roll
> > over to the ecosystem line of thinking knowing they will not get
> > another analytics/sidecar release.
> >
> > Regards
> >
> > On Thu, Jun 25, 2026 at 7:17 PM Josh McKenzie <[email protected]> wrote:
> > >
> > > Thanks for the feedback Bernardo!
> > >
> > > What about adding some details on the “Proposed Changes” section about
> > > the current building scripts and how they are going to be merged?
> > >
> > > My instinct with stuff like this is to try and target the smallest scoped
> > > change and do things discretely. In this case that would translate into
> > > "naively merge the two build systems together, decoupling /
> > > disambiguating duplicate gradle task names, and defer fixing the build
> > > and making the devx nice to a subsequent effort". I 100% agree w/you that
> > > not only will having the projects merged really facilitate us cleaning up
> > > the build process, but also that they could use some love and attention.
> > >
> > > I've been chatting w/Yifan quite a bit about this topic in the past
> > > couple weeks - whether CEP's are forward-looking design docs or primarily
> > > mechanisms for us to get early alignment and consensus up front for
> > > potentially fraught topics. I've been approaching them as the latter
> > > (fraught early alignment) but I'm coming around to thinking that's too
> > > reductive and leaves value on the table.
> > >
> > > So with all that long-winded meta-analysis, I fully accept your point and
> > > I'll refine the draft on that front. I'm in the middle of doing exactly
> > > that on another related project and some of those changes will directly
> > > apply here (things like auto checking for localhost availability for
> > > in-jvm dtests in gradle and early failing on integration tests
> > > w/instructions on how to fix, auto-building dtest jars as part of the
> > > build process, etc).
> > >
> > > while this change does take place, new features and additions will be
> > > frozen to the soon to be deprecated repositories. But, what about
> > > security fixes, bugs, etc? Do we have a plan to address them while doing
> > > this merge?
> > >
> > > I think I should refine what I'm proposing on the timeline so it's
> > > clearer because the CEP should address this. There's basically 2 binary
> > > "phases" to this:
> > > - Pre: everyone still works on cassandra-analytics and cassandra-sidecar
> > > while someone (i.e. me (i.e. claude)) sets up cassandra-ecosystem. All
> > > work goes "upstream" and I merge it into the ecosystem repo. In theory
> > > this should be trivial since I'll just be changing locations and changing
> > > build files, not aggressively changing the code or refactoring at this
> > > time.
> > > - Post: Once we've cut a release from ecosystem and the whole stack is up
> > > and working and the code is identical to upstream (i.e. all PR's merged
> > > in, etc), we freeze the 2 upstream and all work goes into ecosystem.
> > > - Post++: We work on removing duplicated code in the merged repo and do
> > > the above "clean up and augment the build system" work.
> > >
> > > Transition from Pre to Post happens when all changes from the old repos
> > > are reflected in ecosystem, CI is green, and we successfully cut releases
> > > from it. And I email the dev list w/warning and get consensus on it.
> > > There's always 1 canonical place for the community to contribute to for
> > > sidecar or analytics work and a discrete point in time where that
> > > location cuts over.
> > >
> > > That clarify? If so I can take a crack at refining the text in the CEP to
> > > try and make that more clear.
> > >
> > > On Wed, Jun 24, 2026, at 9:00 PM, Bernardo Botella wrote:
> > >
> > > This is awesome Josh!! Thanks a lot for putting this all together. I love
> > > what I’ve read.
> > >
> > > If I had to nitpick, I’d like for it to have a little bit more attention
> > > to the actual unification of builds. What about adding some details on
> > > the “Proposed Changes” section about the current building scripts and how
> > > they are going to be merged? Right now, they are basically both copy
> > > pasta from the main Cassandra build scripts. I don’t think we should fix
> > > that copy pasta on this effort by any means, but we are at least merging
> > > those two copy pastas (sidecar and analytics) into one copy pasta
> > > (ecosystem). Having it in this CEP turns the actual building process into
> > > a first class citizen, and rightfully so in my opinion.
> > >
> > > Also, I guess that while this change does take place, new features and
> > > additions will be frozen to the soon to be deprecated repositories. But,
> > > what about security fixes, bugs, etc? Do we have a plan to address them
> > > while doing this merge? (Just in case it drags in time).
> > >
> > > Other than that, eager to see the artifacts 0.6.0 released from here. :-)
> > >
> > > Thanks!
> > > Bernardo
> > >
> > >
> > >
> > > On Tue, Jun 23, 2026 at 1:10 AM Josh McKenzie <[email protected]> wrote:
> > >
> > >
> > > CEP Draft:
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=435028087
> > >
> > > DISCUSS ML thread on the topic that led to the CEP:
> > > https://lists.apache.org/thread/slpn99x51yxmrhg9oncb5olq9kjqj5js
> > >
> > > From the CEP:
> > >
> > > This CEP proposes consolidating the two separate Apache Cassandra
> > > companion repositories - cassandra-analytics and cassandra-sidecar - into
> > > a single new repository, cassandra-ecosystem, and establishing the
> > > versioning, release, API-stability, and CI practices needed to make
> > > co-location safe for existing production consumers.
> > >
> > >
> > > We had some solid discussion and a pretty clear consensus on the
> > > direction. The CEP contains more opinionated language and proposals
> > > around moving from our hybrid JIRA + github project management to pure
> > > github. The API compatibility contract and release/versioning model are
> > > also new (we talked about them on the DISCUSS thread but new to the
> > > projects) so definitely curious to hear what everyone thinks now that
> > > it's more fleshed out in the CEP.
> > >
> > > And sorry it ended up being longer Ekaterina :); once I started digging
> > > into the nuts and bolts of it there's a lot of ground to cover. I really
> > > appreciate all the engagement on the previous DISCUSS thread and am
> > > looking forward to more of that same energy here.
> > >
> > > ~Josh
> > >
> > >
> >
> >
>