+1 on cassandra-ecosystem. Cassandra-buddy would be fun, but sadly, ecosystem is more on brand for what this needs to be.
+1 on a CEP just as a matter of record and consensus we can point people to when they want to participate.

Patrick

On Thu, Jun 4, 2026 at 9:32 AM Yifan Cai <[email protected]> wrote:

> Happy to go with *cassandra-ecosystem*. The community enthusiasm for the name is a good signal in itself.
> The one mild concern I had was that "ecosystem" could imply Cassandra core is included in scope, but I think that is easily addressed with a clear repository description and README introduction. Consider my earlier suggestion withdrawn.
>
> A CEP is a great idea, and it doesn't need to be exhaustive. It is a place to record the decisions made in this thread, so that they are explicitly committed to rather than informally agreed upon in a mailing list thread. It also directly addresses Jeremiah's concern: the stability annotations and CI enforcement mechanisms we discussed are exactly the kind of promises that belong in a CEP, where new contributors can find them and understand the expectations from day one.
>
> - Yifan
>
> On Thu, Jun 4, 2026 at 7:33 AM Ekaterina Dimitrova <[email protected]> wrote:
>
>> The proposal for a CEP comes from the outcome I see coming from this valuable discussion - people overall agree a merge is valuable as long as the concerns outlined are hashed out
>>
>> On Thu, 4 Jun 2026 at 10:28, Ekaterina Dimitrova <[email protected]> wrote:
>>
>>> Is this CEP-worthy?
>>>
>>> To outline all concerns and expectations?
>>> - backwards compatibility
>>> - releases
>>> - API
>>> - repos
>>> - Jira
>>> - CI
>>> Etc
>>>
>>> It can also help us to make some promises and work towards them; document them more explicitly and make it easier for anyone new starting to find out what the expectations are. Does it make sense?
>>>
>>> I mean it doesn’t have to be a 10-page CEP
>>>
>>>
>>> On Thu, 4 Jun 2026 at 9:58, Josh McKenzie <[email protected]> wrote:
>>>
>>>> I prefer cassandra-ecosystem over cassandra-companion.
>>>> Keeps our options more open going forward (i.e. is a driver a companion? ... no?)
>>>>
>>>> To your point Jeremiah, while you'd think having the 2 projects in separate repos would force us to have cleaner APIs defined between them and versioning, in practice that's not the case today. The discipline / energy required to define a clear API boundary and rev it is probably comparable between the 2 paradigms (i.e. status quo dual repo: less discipline required, more energy; monorepo: more discipline required, less energy). At the end of the day I'd posit this is something we've been very poor at as a community across our entire ecosystem. This will be a new muscle for us to build regardless of how the repos are set up.
>>>>
>>>> Ideally the 2 projects would be independent of one another and have a shared artifact they both depend upon and that API is how we specify compatibility. That should be relatively straightforward to do in a monorepo w/some refactoring, and if we can get to a shared library we publish from a cassandra-ecosystem repo, we can version that and then it's as simple as "if projects you're working with support the same shared library version, they are compatible".
>>>>
>>>> As I write that out, it strikes me that the shared information between them could in theory one day be promoted to a higher architectural tier of shared library where we factor out shared code from analytics and the sidecar, and we factor out shared code from core Cassandra that the ecosystem depends on (i.e. "cassandra-shared", or "cassandra-lib"). Then all 3 projects (+ drivers?) could take a dependency on that shared library, we rev the version of that, and compatibility is defined by that shared substrate.
>>>>
>>>> All very "long term down the road" considerations, but the shape of "get things closer together so they're easier to mutate and work with, then massage the structure and dependencies to make the boundaries and versioning clear through implicit structure" appeals to me.
>>>>
>>>> On Thu, Jun 4, 2026, at 6:00 AM, Shailaja Koppu wrote:
>>>>
>>>> - I like the name cassandra-ecosystem
>>>> - We cannot draw a dependency direction between Analytics and Sidecar. With the Analytics on S3 feature, Analytics can work without Sidecar. Sidecar has many features that have nothing to do with Analytics. So both can be independent of each other.
>>>> - The name cassandra-ecosystem allows us to integrate more such features/components into the repo
>>>>
>>>>
>>>>
>>>> > On Jun 4, 2026, at 10:50 AM, Štefan Miklošovič <[email protected]> wrote:
>>>> >
>>>> > That all makes sense, Yifan.
>>>> >
>>>> > The only issue, or rather a consequence of doing it like that, is this. Imagine that there is a change in Analytics but none in Sidecar and we release a new version. That means that Analytics would contain a new patch but Sidecar would be a "dummy" release. We would bump the version of Sidecar just for the sake of it. Then people trying to investigate what has changed between these versions would realize that, awkwardly, nothing changed.
>>>> >
>>>> > I can live with it. It is just something to be aware of.
>>>> >
>>>> > On Thu, Jun 4, 2026 at 9:42 AM Yifan Cai <[email protected]> wrote:
>>>> >>
>>>> >> Hi all,
>>>> >>
>>>> >> Thanks for the great discussion so far. A few thoughts on the open questions:
>>>> >>
>>>> >> Naming
>>>> >>
>>>> >> I'd like to suggest cassandra-companion as the name for the merged repository.
>>>> >> Both existing names create confusion in opposite directions: operational features like rolling restart and health monitoring feel out of place in cassandra-analytics (Joey's point), while a bulk read/write connector library feels out of place in cassandra-sidecar. A new neutral name avoids subordinating either project's identity to the other, and is broad enough to accommodate future additions beyond Analytics and Sidecar, without implying Cassandra core is included, as names like cassandra-ecosystem or cassandra-platform might.
>>>> >>
>>>> >> For the JIRA project key, CASSCOMP would be a natural fit.
>>>> >>
>>>> >> API Compatibility
>>>> >>
>>>> >> Jeremiah raises a valid concern: co-locating the client and server removes the repo boundary that previously reminded developers they are touching a public API surface. Štefan's versioning model addresses the consumer-facing question ("what runs with what") well, but we also need developer-facing guardrails to mechanically enforce the promise. I'd propose combining three layers:
>>>> >>
>>>> >> Versioning contract (Štefan's model): same major.minor guarantees a compatible Analytics/Sidecar pair; patch releases of Sidecar are safe to advance independently; new REST endpoints require a minor bump
>>>> >> Unified version and release cadence: all modules release together under the same version number. This directly aligns with the merge's core motivation of reducing coordination overhead. The alternative, independent module versioning within the monorepo, would essentially recreate the cross-repo coordination friction we are trying to eliminate. Conveniently, Analytics and Sidecar are currently at the same version number, so there is no awkward jump or reset needed at the point of merge.
>>>> >> CI enforcement: an OpenAPI contract test that fails if a change breaks the API surface relative to the previous release, plus a compatibility matrix test that runs the N-1 Analytics client against the current Sidecar server
>>>> >> Stability annotations: adopt @PublicApi / @InternalApi / @Stable / @Evolving / @Deprecated annotations on the Sidecar API surface, following the pattern established by Kafka and Elasticsearch. This makes the contract explicit and discoverable in code: a developer touching an annotated method immediately sees its stability guarantee and since which version it has been public
>>>> >>
>>>> >> The three layers are complementary: the versioning model defines the promise, annotations mark the contract in code, and CI enforces the promise mechanically. The unified release cadence ensures the promise is always evaluated as a whole.
>>>> >>
>>>> >> As a side note, Cassandra core currently lacks this kind of API stability clarity, which creates real friction for downstream projects. Establishing this practice in the companion project gives us a concrete, working reference that could motivate and inform a broader Cassandra core evolution down the road. Happy to discuss that separately if there is interest.
>>>> >>
>>>> >> Looking forward to hearing everyone's thoughts.
>>>> >>
>>>> >> Thanks
>>>> >> - Yifan
>>>> >>
>>>> >> On Wed, Jun 3, 2026 at 11:32 PM Štefan Miklošovič <[email protected]> wrote:
>>>> >>>
>>>> >>> Hi Jeremiah,
>>>> >>>
>>>> >>> for now, what I find difficult and I found myself questioning this repeatedly is "what version of Sidecar can I run with Analytics?" Is Sidecar 0.2.0 compatible with Analytics 0.4.0? We just don't know until we run it and try. There is no compatibility matrix for what goes with what.
>>>> >>> If each component is developed independently then I think it will be more messy than if it was released in lock-step.
>>>> >>>
>>>> >>> We might establish a policy that e.g. a patch release of Sidecar is compatible with whatever minor in Analytics. For example, we release both Sidecar and Analytics under unified version 1.0.0. Then we will release 1.0.5 of both next. So we can say that Sidecar 1.0.5 is compatible with Analytics 1.0.0. Or Sidecar 1.1.5 is compatible with Analytics 1.1.0. Basically, Sidecar is a standalone server app a user can run without Analytics but once they are interested in Analytics combo, they would need to run with respective Analytics releases.
>>>> >>>
>>>> >>> If we release Analytics and Sidecar 1.1.0 and you have Sidecar 1.0.5 then you would need to upgrade to 1.1.0 to be sure that it is compatible with Analytics 100% while you could just bump patch releases for Sidecar endlessly if you are interested in Sidecar without Analytics.
>>>> >>>
>>>> >>> This would of course mean that there would need to be awareness in "will this patch I want to ship to Sidecar work in related Analytics minor version when we release it?". We might also say that a new REST endpoint can go only into a new minor version and similar.
>>>> >>>
>>>> >>> This was, of course, just an example and it is all tweakable.
>>>> >>>
>>>> >>> On Wed, Jun 3, 2026 at 11:44 PM Jeremiah Jordan <[email protected]> wrote:
>>>> >>>>>
>>>> >>>>> I worry if we move into the Sidecar repo it's just going to become more coupled and folks in the community are already using Analytics to read from e.g. S3 buckets or other data sources.
>>>> >>>>
>>>> >>>>
>>>> >>>> I have similar concerns.
>>>> >>>> If we start releasing them in lockstep from the same repo, then I worry that people will start making breaking changes to sidecar APIs such that existing Analytics jars out in the wild will not work, without realizing it.
>>>> >>>>
>>>> >>>> Both cassandra-analytics and the cassandra-sidecar are starting to be used out in the world by people in production settings. My expectation for updates to the sidecar APIs is that anything done should not break existing clients. When the client and the server are in different repos, it is much cleaner and clearer to people that you are exposing an API surface which is being consumed externally, and you need to keep things like backwards compatibility in mind. If the client and the server live in the same repo, and are released together, I can see people just changing/refactoring both and not considering existing clients out in the wild. I think them being in separate repos makes that distinction clearer to someone working on a new feature that spans both code bases.
>>>> >>>>
>>>> >>>> Seems like many here want them in the same repo, so I won’t block that, but I have concerns.
>>>> >>>>
>>>> >>>> If we do decide to merge them, I think it should be in a new repo with a new name. I do not think the sidecar belongs in a repo named analytics, or the analytics library belongs in a repo named sidecar. They both have use cases that do not involve the other.
>>>> >>>>
>>>> >>>> -Jeremiah Jordan
>>>> >>>>
>>>> >>>>
>>>> >>>> On Jun 3, 2026 at 11:42:15 AM, James Berragan <[email protected]> wrote:
>>>> >>>>>
>>>> >>>>> Can we break down a bit more where the circular dependency lies? I'm not against it, I just want to make sure we're solving the right problem here. Analytics and CDC were always designed to be agnostic of the Sidecar. What stops us moving just the Sidecar specific parts into the Sidecar repo?
>>>> >>>>> I worry if we move into the Sidecar repo it's just going to become more coupled and folks in the community are already using Analytics to read from e.g. S3 buckets or other data sources.
>>>> >>>>>
>>>> >>>>> James.
>>>> >>>>>
>>>> >>>>> On Tue, 2 Jun 2026 at 13:20, Josh McKenzie <[email protected]> wrote:
>>>> >>>>>>
>>>> >>>>>> I'd like to propose we merge the cassandra-sidecar and cassandra-analytics repositories. I've shopped the idea around to some of you and gotten universally positive feedback with some questions about details we deferred to this discussion.
>>>> >>>>>>
>>>> >>>>>> Reasons we should merge:
>>>> >>>>>>
>>>> >>>>>> Break circular dependencies between the 2 projects
>>>> >>>>>> Remove redundant copy/pasted code
>>>> >>>>>> Simplify build and CI
>>>> >>>>>> Reduce friction on changes that span both projects
>>>> >>>>>> Simplify the CDC implementation
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> Outstanding questions and observations that came up:
>>>> >>>>>>
>>>> >>>>>> Do we merge one repository into the other? Or do we create a new project and bring them both in?
>>>> >>>>>> What do we do about JIRA? Leave separate or combine?
>>>> >>>>>> What do we do with open issues and PRs in GitHub?
>>>> >>>>>> We'll need to thoughtfully update CI (GitHub + Circle) since we're right at the limit on the free tier on both projects
>>>> >>>>>> What do we do about existing deprecated repositories (cassandra-analytics and/or cassandra-sidecar)?
>>>> >>>>>> We'll need to update our release process
>>>> >>>>>>
>>>> >>>>>>
>>>> >>>>>> Other observations or questions welcome, as are thoughts on the entire process, on the outstanding questions, etc.
>>>> >>>>>>
>>>> >>>>>> Looking forward to the discussion everyone.
>>>> >>>>>>
>>>> >>>>>> ~Josh
>>>> >>>>
>>>> >>>>
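
A note on the versioning contract discussed in this thread: the rule that a Sidecar release and an Analytics release sharing the same major.minor form a guaranteed-compatible pair, with patch versions free to differ, could be checked mechanically along the following lines. This is only a minimal sketch in Java. The class ReleaseVersion and its methods are hypothetical names made up for illustration; nothing like this is confirmed to exist in either project, and the actual policy is still open per the thread.

```java
// Hypothetical helper, not part of cassandra-sidecar or cassandra-analytics.
// It encodes the rule as proposed in the thread: same major.minor means a
// guaranteed-compatible pair, and the patch number is ignored.
final class ReleaseVersion {
    final int major;
    final int minor;
    final int patch;

    ReleaseVersion(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    // Parses a plain "major.minor.patch" string such as "1.0.5".
    static ReleaseVersion parse(String text) {
        String[] parts = text.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected major.minor.patch, got: " + text);
        }
        return new ReleaseVersion(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    // True when both releases share major and minor. A false result means
    // "no guarantee", not "known to be broken".
    static boolean guaranteedCompatible(String sidecarVersion, String analyticsVersion) {
        ReleaseVersion sidecar = parse(sidecarVersion);
        ReleaseVersion analytics = parse(analyticsVersion);
        return sidecar.major == analytics.major && sidecar.minor == analytics.minor;
    }
}
```

With the examples from the thread, Sidecar 1.0.5 with Analytics 1.0.0 and Sidecar 1.1.5 with Analytics 1.1.0 both pass this check, while Sidecar 1.0.5 with Analytics 1.1.0 does not, so that pairing carries no guarantee until Sidecar is upgraded to 1.1.x.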
