> I worry that the repository becomes a box of random “ecosystem” projects or
> libraries that are totally unrelated to these two in the future. We need to
> be careful on what goes into that “ecosystem” repository.

How bad would it be to have effectively a "giant bucket of ecosystem stuff" where the dependencies and such are clearly outlined based on their build relationships in gradle?

As an extreme thought experiment, what if all the drivers, ccm, python dtest, sidecar, and analytics all lived in the same repo? (For the record, I'm not advocating for this at all, just trying to test out the hypothetical.) It'd be harder to ramp up on things and figure out where things are in the repo, .md docs would be noisier, and we'd potentially have more commit collisions with each other.

Assuming all the projects had modern build systems, however, our ability to cut releases and effectively work on each of those in isolation from one another should be comparable, right? i.e. if someone pushed a commit to a driver and I raced trying to merge a change to sidecar, it's a simple noop rebase and then push again.

So. That being said, I agree we should be careful on what goes into that repository. But my hot take is that even the most extreme of "monorepo everything except core C*" shouldn't present any really obnoxious operational hurdles.

On Wed, Jun 10, 2026, at 11:14 AM, Bernardo Botella wrote:
> Joining this conversation a little bit late, but hey, better late than never!
>
> So, having been part of some early conversations for this merge, I’m totally
> +1 on it! It will definitely help with releases and some operational pain
> that we are suffering.
>
> The circular dependencies at this point tell us that it is really hard to
> have analytics without sidecar, and some things from sidecar without
> analytics (I’m looking at you, CDC).
>
> Now, having laid that out, I like the proposed names, but I worry that the
> repository becomes a box of random “ecosystem” projects or libraries that are
> totally unrelated to these two in the future. We need to be careful on what
> goes into that “ecosystem” repository.
>
> Sadly, I don’t have a better proposal (other than having analytics inside
> Sidecar as some of the initial suggestions on this thread), and I am
> definitely not opposed to cassandra-ecosystem.
> But, it would be nice to have explicit care and love on how we define the
> repository (naming + description in the readme).
>
> Bernardo
>
> On Wed, Jun 10, 2026 at 10:30 PM, Doug Rohrer <[email protected]> wrote:
>> +1 on cassandra-ecosystem as well. Moving sidecar into analytics, or
>> analytics into sidecar, would both seem to limit the scope of what the
>> project is supposed to do, and we already have non-analytics things in
>> sidecar, and non-sidecar things in Analytics.
>>
>> However, putting the two together into something makes sense, much like
>> chocolate and peanut butter:
>>
>> Reese's Peanut Butter Cups Chocolate In My Peanut Butter 1979 TV Commercial
>> HD <https://www.youtube.com/watch?v=JuCMJWSK6Bk>
>>
>> Doug
>>
>>> On Jun 9, 2026, at 3:43 PM, Josh McKenzie <[email protected]> wrote:
>>>
>>> This thread has been quiet for a few days. Anybody else have anything they
>>> want to bring up before I start drafting up a CEP for this work?
>>>
>>> On Thu, Jun 4, 2026, at 12:36 PM, Patrick McFadin wrote:
>>>> +1 on cassandra-ecosystem. Cassandra-buddy would be fun, but sadly,
>>>> ecosystem is more on brand for what this needs to be.
>>>>
>>>> +1 on a CEP just as a matter of record and consensus we can point people
>>>> to when they want to participate.
>>>>
>>>> Patrick
>>>>
>>>> On Thu, Jun 4, 2026 at 9:32 AM Yifan Cai <[email protected]> wrote:
>>>>> Happy to go with *cassandra-ecosystem*. The community enthusiasm for the
>>>>> name is a good signal in itself.
>>>>> The one mild concern I had was that "ecosystem" could imply Cassandra
>>>>> core is included in scope, but I think that is easily addressed with a
>>>>> clear repository description and README introduction. Consider my earlier
>>>>> suggestion withdrawn.
>>>>>
>>>>> A CEP is a great idea, and it doesn't need to be exhaustive.
>>>>> It is a place to record the decisions made in this thread, so that they are
>>>>> explicitly committed to rather than informally agreed upon in a mailing
>>>>> list thread.
>>>>> It also directly addresses Jeremiah's concern: the stability annotations
>>>>> and CI enforcement mechanisms we discussed are exactly the kind of
>>>>> promises that belong in a CEP, where new contributors can find them and
>>>>> understand the expectations from day one.
>>>>>
>>>>> - Yifan
>>>>>
>>>>> On Thu, Jun 4, 2026 at 7:33 AM Ekaterina Dimitrova
>>>>> <[email protected]> wrote:
>>>>>> The proposal for a CEP comes from the outcome I see coming from this
>>>>>> valuable discussion: people overall agree a merge is valuable as long
>>>>>> as the concerns outlined are hashed out.
>>>>>>
>>>>>> On Thu, 4 Jun 2026 at 10:28, Ekaterina Dimitrova <[email protected]>
>>>>>> wrote:
>>>>>>> Is this CEP-worthy?
>>>>>>>
>>>>>>> To outline all concerns and expectations?
>>>>>>> - backwards compatibility
>>>>>>> - releases
>>>>>>> - API
>>>>>>> - repos
>>>>>>> - Jira
>>>>>>> - CI
>>>>>>> Etc
>>>>>>>
>>>>>>> It can help us also to make some promises and work towards them;
>>>>>>> document them more explicitly and make it easier for anyone new
>>>>>>> starting to find out what the expectations are. Does it make sense?
>>>>>>>
>>>>>>> I mean it doesn’t have to be a 10-page CEP
>>>>>>>
>>>>>>>
>>>>>>> On Thu, 4 Jun 2026 at 9:58, Josh McKenzie <[email protected]> wrote:
>>>>>>>> I prefer cassandra-ecosystem over cassandra-companion. Keeps our
>>>>>>>> options more open going forward (i.e. is a driver a companion? ... no?)
>>>>>>>>
>>>>>>>> To your point Jeremiah, while you'd think having the 2 projects in
>>>>>>>> separate repos would force us to have cleaner APIs defined between
>>>>>>>> them and versioning, in practice that's not the case today. The
>>>>>>>> discipline / energy required to define a clear API boundary and rev it
>>>>>>>> is probably comparable between the 2 paradigms (i.e.
>>>>>>>> status quo dual repo: less discipline required, more energy; monorepo:
>>>>>>>> more discipline required, less energy). At the end of the day I'd posit
>>>>>>>> this is something we've been very poor at as a community across our
>>>>>>>> entire ecosystem. This will be a new muscle for us to build regardless
>>>>>>>> of how the repos are set up.
>>>>>>>>
>>>>>>>> Ideally the 2 projects would be independent of one another and have a
>>>>>>>> shared artifact they both depend upon and that API is how we specify
>>>>>>>> compatibility. That should be relatively straightforward to do in a
>>>>>>>> monorepo w/some refactoring, and if we can get to a shared library we
>>>>>>>> publish from a cassandra-ecosystem repo, we can version that and then
>>>>>>>> it's as simple as "if projects you're working with support the same
>>>>>>>> shared library version, they are compatible".
>>>>>>>>
>>>>>>>> As I write that out, it strikes me that the shared information between
>>>>>>>> them could in theory one day be promoted to a higher architectural
>>>>>>>> tier of shared library where we factor out shared code from analytics
>>>>>>>> and the sidecar, and we factor out shared code from core Cassandra
>>>>>>>> that the ecosystem depends on (i.e. "cassandra-shared", or
>>>>>>>> "cassandra-lib"). Then all 3 projects (+ drivers?) could take a
>>>>>>>> dependency on that shared library, we rev the version of that, and
>>>>>>>> compatibility is defined by that shared substrate.
>>>>>>>>
>>>>>>>> All very "long term down the road" considerations, but the shape of
>>>>>>>> "get things closer together so they're easier to mutate and work with,
>>>>>>>> then massage the structure and dependencies to make the boundaries and
>>>>>>>> versioning clear through implicit structure" appeals to me.
>>>>>>>>
>>>>>>>> On Thu, Jun 4, 2026, at 6:00 AM, Shailaja Koppu wrote:
>>>>>>>>> - I like the name cassandra-ecosystem
>>>>>>>>> - We cannot draw a dependency direction between Analytics and Sidecar.
>>>>>>>>> With the Analytics on S3 feature, Analytics can work without Sidecar.
>>>>>>>>> Sidecar has many features that have nothing to do with Analytics. So
>>>>>>>>> both can be independent of each other.
>>>>>>>>> - The name cassandra-ecosystem allows us to integrate more such
>>>>>>>>> features/components into the repo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > On Jun 4, 2026, at 10:50 AM, Štefan Miklošovič
>>>>>>>>> > <[email protected]> wrote:
>>>>>>>>> >
>>>>>>>>> > That all makes sense, Yifan.
>>>>>>>>> >
>>>>>>>>> > The only issue, and it is not actually an issue so much as a
>>>>>>>>> > consequence of doing it like that: imagine that there is a change in
>>>>>>>>> > Analytics but none in Sidecar and we release a new version. That
>>>>>>>>> > means that Analytics would contain a new patch but Sidecar would be a
>>>>>>>>> > "dummy" release. We would bump the version of Sidecar just for the
>>>>>>>>> > sake of it. Then people trying to investigate what has changed
>>>>>>>>> > between these versions would realize that, awkwardly, nothing
>>>>>>>>> > changed.
>>>>>>>>> >
>>>>>>>>> > I can live with it. It is just something to be aware of.
>>>>>>>>> >
>>>>>>>>> > On Thu, Jun 4, 2026 at 9:42 AM Yifan Cai <[email protected]> wrote:
>>>>>>>>> >>
>>>>>>>>> >> Hi all,
>>>>>>>>> >>
>>>>>>>>> >> Thanks for the great discussion so far. A few thoughts on the open
>>>>>>>>> >> questions:
>>>>>>>>> >>
>>>>>>>>> >> Naming
>>>>>>>>> >>
>>>>>>>>> >> I'd like to suggest cassandra-companion as the name for the merged
>>>>>>>>> >> repository. Both existing names create confusion in opposite
>>>>>>>>> >> directions: operational features like rolling restart and health
>>>>>>>>> >> monitoring feel out of place in cassandra-analytics (Joey's
>>>>>>>>> >> point), while a bulk read/write connector library feels out of
>>>>>>>>> >> place in cassandra-sidecar.
>>>>>>>>> >> A new neutral name avoids subordinating either project's identity
>>>>>>>>> >> to the other, and is broad enough to accommodate future additions
>>>>>>>>> >> beyond Analytics and Sidecar, without implying Cassandra core is
>>>>>>>>> >> included, as names like cassandra-ecosystem or cassandra-platform
>>>>>>>>> >> might.
>>>>>>>>> >>
>>>>>>>>> >> For the JIRA project key, CASSCOMP would be a natural fit.
>>>>>>>>> >>
>>>>>>>>> >> API Compatibility
>>>>>>>>> >>
>>>>>>>>> >> Jeremiah raises a valid concern: co-locating the client and
>>>>>>>>> >> server removes the repo boundary that previously reminded
>>>>>>>>> >> developers they are touching a public API surface. Štefan's
>>>>>>>>> >> versioning model addresses the consumer-facing question ("what
>>>>>>>>> >> runs with what") well, but we also need developer-facing
>>>>>>>>> >> guardrails to mechanically enforce the promise. I'd propose
>>>>>>>>> >> combining three layers:
>>>>>>>>> >>
>>>>>>>>> >> - Versioning contract (Štefan's model): same major.minor guarantees
>>>>>>>>> >>   a compatible Analytics/Sidecar pair; patch releases of Sidecar are
>>>>>>>>> >>   safe to advance independently; new REST endpoints require a minor
>>>>>>>>> >>   bump
>>>>>>>>> >> - Unified version and release cadence: all modules release together
>>>>>>>>> >>   under the same version number. This directly aligns with the
>>>>>>>>> >>   merge's core motivation of reducing coordination overhead. The
>>>>>>>>> >>   alternative, independent module versioning within the monorepo,
>>>>>>>>> >>   would essentially recreate the cross-repo coordination friction we
>>>>>>>>> >>   are trying to eliminate. Conveniently, Analytics and Sidecar are
>>>>>>>>> >>   currently at the same version number, so there is no awkward jump
>>>>>>>>> >>   or reset needed at the point of merge.
>>>>>>>>> >> - CI enforcement: an OpenAPI contract test that fails if a change
>>>>>>>>> >>   breaks the API surface relative to the previous release, plus a
>>>>>>>>> >>   compatibility matrix test that runs the N-1 Analytics client
>>>>>>>>> >>   against the current Sidecar server
>>>>>>>>> >> - Stability annotations: adopt @PublicApi / @InternalApi / @Stable /
>>>>>>>>> >>   @Evolving / @Deprecated annotations on the Sidecar API surface,
>>>>>>>>> >>   following the pattern established by Kafka and Elasticsearch. This
>>>>>>>>> >>   makes the contract explicit and discoverable in code: a developer
>>>>>>>>> >>   touching an annotated method immediately sees its stability
>>>>>>>>> >>   guarantee and since which version it has been public
>>>>>>>>> >>
>>>>>>>>> >> The three layers are complementary: the versioning model defines
>>>>>>>>> >> the promise, annotations mark the contract in code, and CI
>>>>>>>>> >> enforces the promise mechanically. The unified release cadence
>>>>>>>>> >> ensures the promise is always evaluated as a whole.
>>>>>>>>> >>
>>>>>>>>> >> As a side note: Cassandra core currently lacks this kind of API
>>>>>>>>> >> stability clarity, which creates real friction for downstream
>>>>>>>>> >> projects. Establishing this practice in the companion project
>>>>>>>>> >> gives us a concrete, working reference that could motivate and
>>>>>>>>> >> inform a broader Cassandra core evolution down the road. Happy to
>>>>>>>>> >> discuss that separately if there is interest.
>>>>>>>>> >>
>>>>>>>>> >> Looking forward to hearing everyone's thoughts.
>>>>>>>>> >>
>>>>>>>>> >> Thanks
>>>>>>>>> >> - Yifan
>>>>>>>>> >>
>>>>>>>>> >> On Wed, Jun 3, 2026 at 11:32 PM Štefan Miklošovič
>>>>>>>>> >> <[email protected]> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>> Hi Jeremiah,
>>>>>>>>> >>>
>>>>>>>>> >>> for now, what I find difficult and I found myself questioning this
>>>>>>>>> >>> repeatedly is "what version of Sidecar can I run with Analytics?"
>>>>>>>>> >>> Is Sidecar 0.2.0 compatible with Analytics 0.4.0? We just don't know
>>>>>>>>> >>> until we run it and try. There is no compatibility matrix for what
>>>>>>>>> >>> goes with what. If each component is developed independently then I
>>>>>>>>> >>> think it will be more messy than if it was released in lock-step.
>>>>>>>>> >>>
>>>>>>>>> >>> We might establish a policy that e.g. a patch release of Sidecar is
>>>>>>>>> >>> compatible with the corresponding minor of Analytics. For example,
>>>>>>>>> >>> we release both Sidecar and Analytics under unified version 1.0.0.
>>>>>>>>> >>> Then we will release 1.0.5 of both next. So we can say that Sidecar
>>>>>>>>> >>> 1.0.5 is compatible with Analytics 1.0.0. Or Sidecar 1.1.5 is
>>>>>>>>> >>> compatible with Analytics 1.1.0. Basically, Sidecar is a standalone
>>>>>>>>> >>> server app a user can run without Analytics but once they are
>>>>>>>>> >>> interested in the Analytics combo, they would need to run with the
>>>>>>>>> >>> respective Analytics releases.
>>>>>>>>> >>>
>>>>>>>>> >>> If we release Analytics and Sidecar 1.1.0 and you have Sidecar 1.0.5
>>>>>>>>> >>> then you would need to upgrade to 1.1.0 to be sure that it is
>>>>>>>>> >>> compatible with Analytics 100% while you could just bump patch
>>>>>>>>> >>> releases for Sidecar endlessly if you are interested in Sidecar
>>>>>>>>> >>> without Analytics.
>>>>>>>>> >>>
>>>>>>>>> >>> This would of course mean that there would need to be awareness in
>>>>>>>>> >>> "will this patch I want to ship to Sidecar work in related Analytics
>>>>>>>>> >>> minor version when we release it?". We might also say that a new
>>>>>>>>> >>> REST endpoint can go only into a new minor version and similar.
>>>>>>>>> >>>
>>>>>>>>> >>> This was, of course, just an example and it is all tweakable.
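
[Editorial sketch, not part of the original message] The example policy in the message above (a Sidecar patch release stays compatible with the Analytics release of the same major.minor) can be written down as a small check. This is only a minimal illustration under that assumption; the names `Version`, `parse`, and `compatible` are hypothetical and do not come from either project, and the thread itself notes the policy is "all tweakable".

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A plain major.minor.patch version triple (hypothetical helper)."""
    major: int
    minor: int
    patch: int


def parse(text: str) -> Version:
    """Parse a 'major.minor.patch' string such as '1.0.5'."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def compatible(sidecar: str, analytics: str) -> bool:
    """Assumed rule from the thread's example: the pair is compatible
    when major and minor match; the patch component is ignored."""
    s, a = parse(sidecar), parse(analytics)
    return (s.major, s.minor) == (a.major, a.minor)


# The pairs used as examples in the message above.
print(compatible("1.0.5", "1.0.0"))  # True: same 1.0 line
print(compatible("1.1.5", "1.1.0"))  # True: same 1.1 line
print(compatible("1.0.5", "1.1.0"))  # False: Sidecar must move to 1.1.x
```

Under this rule a Sidecar-only user can keep taking patch releases, while anyone pairing Sidecar with Analytics only has to compare the first two version components.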
>>>>>>>>> >>>
>>>>>>>>> >>> On Wed, Jun 3, 2026 at 11:44 PM Jeremiah Jordan
>>>>>>>>> >>> <[email protected]> wrote:
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> I worry if we move into the Sidecar repo it's just going to
>>>>>>>>> >>>>> become more coupled and folks in the community are already
>>>>>>>>> >>>>> using Analytics to read from e.g. S3 buckets or other data
>>>>>>>>> >>>>> sources.
>>>>>>>>> >>>>
>>>>>>>>> >>>>
>>>>>>>>> >>>> I have similar concerns. If we start releasing them in lockstep
>>>>>>>>> >>>> from the same repo, then I worry that people will start making
>>>>>>>>> >>>> breaking changes to sidecar APIs such that existing Analytics
>>>>>>>>> >>>> jars out in the wild will not work, without realizing it.
>>>>>>>>> >>>>
>>>>>>>>> >>>> Both cassandra-analytics and the cassandra-sidecar are starting
>>>>>>>>> >>>> to be used out in the world by people in production settings.
>>>>>>>>> >>>> My expectation for updates to the sidecar APIs is that anything
>>>>>>>>> >>>> done should not break existing clients. When the client and the
>>>>>>>>> >>>> server are in different repos, it is much cleaner and clearer to
>>>>>>>>> >>>> people that you are exposing an API surface which is being
>>>>>>>>> >>>> consumed externally, and you need to keep things like backwards
>>>>>>>>> >>>> compatibility in mind. If the client and the server live in the
>>>>>>>>> >>>> same repo, and are released together, I can see people just
>>>>>>>>> >>>> changing/refactoring both and not considering existing clients
>>>>>>>>> >>>> out in the wild. I think them being in separate repos makes
>>>>>>>>> >>>> that distinction clearer to someone working on a new feature
>>>>>>>>> >>>> that spans both code bases.
>>>>>>>>> >>>>
>>>>>>>>> >>>> Seems like many here want them in the same repo, so I won’t
>>>>>>>>> >>>> block that, but I have concerns.
>>>>>>>>> >>>>
>>>>>>>>> >>>> If we do decide to merge them, I think it should be in a new
>>>>>>>>> >>>> repo with a new name.
>>>>>>>>> >>>> I do not think the sidecar belongs in a repo named analytics, or
>>>>>>>>> >>>> the analytics library belongs in a repo named sidecar. They both
>>>>>>>>> >>>> have use cases that do not involve the other.
>>>>>>>>> >>>>
>>>>>>>>> >>>> -Jeremiah Jordan
>>>>>>>>> >>>>
>>>>>>>>> >>>>
>>>>>>>>> >>>> On Jun 3, 2026 at 11:42:15 AM, James Berragan
>>>>>>>>> >>>> <[email protected]> wrote:
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> Can we break down a bit more where the circular dependency
>>>>>>>>> >>>>> lies? I'm not against it, I just want to make sure we're
>>>>>>>>> >>>>> solving the right problem here. Analytics and CDC were always
>>>>>>>>> >>>>> designed to be agnostic of the Sidecar. What stops us moving
>>>>>>>>> >>>>> just the Sidecar specific parts into the Sidecar repo? I worry
>>>>>>>>> >>>>> if we move into the Sidecar repo it's just going to become more
>>>>>>>>> >>>>> coupled and folks in the community are already using Analytics
>>>>>>>>> >>>>> to read from e.g. S3 buckets or other data sources.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> James.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> On Tue, 2 Jun 2026 at 13:20, Josh McKenzie
>>>>>>>>> >>>>> <[email protected]> wrote:
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> I'd like to propose we merge the cassandra-sidecar and
>>>>>>>>> >>>>>> cassandra-analytics repositories. I've shopped the idea around
>>>>>>>>> >>>>>> to some of you and gotten universally positive feedback with
>>>>>>>>> >>>>>> some questions about details we deferred to this discussion.
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> Reasons we should merge:
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> - Break circular dependencies between the 2 projects
>>>>>>>>> >>>>>> - Remove redundant copy/pasted code
>>>>>>>>> >>>>>> - Simplify build and CI
>>>>>>>>> >>>>>> - Reduce friction on changes that span both projects
>>>>>>>>> >>>>>> - Simplify the CDC implementation
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> Outstanding questions and observations that came up:
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> - Do we merge one repository into the other? Or do we create a
>>>>>>>>> >>>>>>   new project and bring them both in?
>>>>>>>>> >>>>>> - What do we do about JIRA? Leave separate or combine?
>>>>>>>>> >>>>>> - What do we do with open issues and PRs in github?
>>>>>>>>> >>>>>> - We'll need to thoughtfully update CI (github + circle) since
>>>>>>>>> >>>>>>   we're right at the limit on the free tier on both projects
>>>>>>>>> >>>>>> - What do we do about existing deprecated repositories
>>>>>>>>> >>>>>>   (cassandra-analytics and/or cassandra-sidecar)?
>>>>>>>>> >>>>>> - We'll need to update our release process
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> Other observations or questions welcome, as are thoughts on
>>>>>>>>> >>>>>> the entire process, on the outstanding questions, etc.
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> Looking forward to the discussion everyone.
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> ~Josh
