That all makes sense, Yifan. There is one thing to note, which is less an issue than a consequence of doing it this way. Imagine that there is a change in Analytics but none in Sidecar, and we release a new version. Analytics would contain a new patch, but Sidecar would be a "dummy" release: we would bump the Sidecar version just for the sake of it. People investigating what changed between these versions would then discover that, awkwardly, nothing changed.
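
To make that concrete, here is a minimal sketch in Python. Every name in it is hypothetical; none of this is existing Sidecar or Analytics code. It assumes the rule discussed in this thread (same major.minor means a compatible Analytics/Sidecar pair) and a unified release where every module gets the new version whether or not it changed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A plain major.minor.patch version (hypothetical helper type)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Expects exactly three dot-separated integers, e.g. "1.0.5".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_compatible(analytics: Version, sidecar: Version) -> bool:
    # Assumed rule from the thread: equal major.minor is a compatible pair;
    # the patch component is free to differ.
    return (analytics.major, analytics.minor) == (sidecar.major, sidecar.minor)


def is_dummy_release(changed_modules: set, module: str) -> bool:
    # In a unified release every module is bumped; a module with no changes
    # since the previous release is a "dummy" release.
    return module not in changed_modules
```

With only Analytics changed between 1.0.5 and 1.0.6, `is_dummy_release({"analytics"}, "sidecar")` is true: Sidecar 1.0.6 exists but is identical to 1.0.5, while `is_compatible` still holds for any pair within 1.0.x.
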
I can live with it. It is just something to be aware of.

On Thu, Jun 4, 2026 at 9:42 AM Yifan Cai <[email protected]> wrote:
>
> Hi all,
>
> Thanks for the great discussion so far. A few thoughts on the open questions:
>
> Naming
>
> I'd like to suggest cassandra-companion as the name for the merged
> repository. Both existing names create confusion in opposite directions:
> operational features like rolling restart and health monitoring feel out of
> place in cassandra-analytics (Joey's point), while a bulk read/write
> connector library feels out of place in cassandra-sidecar. A new neutral name
> avoids subordinating either project's identity to the other, and is broad
> enough to accommodate future additions beyond Analytics and Sidecar, without
> implying Cassandra core is included, as names like cassandra-ecosystem or
> cassandra-platform might.
>
> For the JIRA project key, CASSCOMP would be a natural fit.
>
> API Compatibility
>
> Jeremiah raises a valid concern: co-locating the client and server removes
> the repo boundary that previously reminded developers they are touching a
> public API surface. Štefan's versioning model addresses the consumer-facing
> question ("what runs with what") well, but we also need developer-facing
> guardrails to mechanically enforce the promise. I'd propose combining three
> layers:
>
> - Versioning contract (Štefan's model): same major.minor guarantees a
>   compatible Analytics/Sidecar pair; patch releases of Sidecar are safe to
>   advance independently; new REST endpoints require a minor bump
> - Unified version and release cadence: all modules release together under the
>   same version number. This directly aligns with the merge's core motivation of
>   reducing coordination overhead. The alternative, independent module
>   versioning within the monorepo, would essentially recreate the cross-repo
>   coordination friction we are trying to eliminate. Conveniently, Analytics and
>   Sidecar are currently at the same version number, so there is no awkward jump
>   or reset needed at the point of merge.
> - CI enforcement: an OpenAPI contract test that fails if a change breaks the
>   API surface relative to the previous release, plus a compatibility matrix
>   test that runs the N-1 Analytics client against the current Sidecar server
> - Stability annotations: adopt @PublicApi / @InternalApi / @Stable / @Evolving
>   / @Deprecated annotations on the Sidecar API surface, following the pattern
>   established by Kafka and Elasticsearch. This makes the contract explicit and
>   discoverable in code: a developer touching an annotated method immediately
>   sees its stability guarantee and since which version it has been public
>
> The three layers are complementary: the versioning model defines the promise,
> annotations mark the contract in code, and CI enforces the promise
> mechanically. The unified release cadence ensures the promise is always
> evaluated as a whole.
>
> As a side note, Cassandra core currently lacks this kind of API stability
> clarity, which creates real friction for downstream projects. Establishing
> this practice in the companion project gives us a concrete, working reference
> that could motivate and inform a broader Cassandra core evolution down the
> road. Happy to discuss that separately if there is interest.
>
> Looking forward to hearing everyone's thoughts.
>
> Thanks
> - Yifan
>
> On Wed, Jun 3, 2026 at 11:32 PM Štefan Miklošovič <[email protected]>
> wrote:
>>
>> Hi Jeremiah,
>>
>> for now, what I find difficult and I found myself questioning this
>> repeatedly is "what version of Sidecar can I run with Analytics?" Is
>> Sidecar 0.2.0 compatible with Analytics 0.4.0? We just don't know
>> until we run it and try. There is no compatibility matrix for what
>> goes with what. If each component is developed independently then I
>> think it will be more messy than if it was released in lock-step.
>>
>> We might establish a policy that e.g. a patch release of Sidecar is
>> compatible with whatever minor in Analytics. For example, we release
>> both Sidecar and Analytics under unified version 1.0.0. Then we will
>> release 1.0.5 of both next. So we can say that Sidecar 1.0.5 is
>> compatible with Analytics 1.0.0. Or Sidecar 1.1.5 is compatible with
>> Analytics 1.1.0. Basically, Sidecar is a standalone server app a user
>> can run without Analytics but once they are interested in Analytics
>> combo, they would need to run with respective Analytics releases.
>>
>> If we release Analytics and Sidecar 1.1.0 and you have Sidecar 1.0.5
>> then you would need to upgrade to 1.1.0 to be sure that it is
>> compatible with Analytics 100% while you could just bump patch
>> releases for Sidecar endlessly if you are interested in Sidecar
>> without Analytics.
>>
>> This would of course mean that there would need to be awareness in
>> "will this patch I want to ship to Sidecar work in related Analytics
>> minor version when we release it?". We might also say that a new REST
>> endpoint can go only into a new minor version and similar.
>>
>> This was, of course, just an example and it is all tweakable.
>>
>> On Wed, Jun 3, 2026 at 11:44 PM Jeremiah Jordan <[email protected]> wrote:
>> >>
>> >> I worry if we move into the Sidecar repo it's just going to become more
>> >> coupled and folks in the community are already using Analytics to read
>> >> from e.g. S3 buckets or other data sources.
>> >
>> >
>> > I have similar concerns. If we start releasing them in lockstep from the
>> > same repo, then I worry that people will start making breaking changes to
>> > sidecar APIs such that existing Analytics jars out in the wild will not
>> > work, without realizing it.
>> >
>> > Both cassandra-analytics and the cassandra-sidecar are starting to be used
>> > out in the world by people in production settings. My expectation for
>> > updates to the sidecar APIs is that anything done should not break
>> > existing clients. When the client and the server are in different repos,
>> > it is much cleaner and clearer to people that you are exposing an API
>> > surface which is being consumed externally, and you need to keep things
>> > like backwards compatibility in mind. If the client and the server live
>> > in the same repo, and are released together, I can see people just
>> > changing/refactoring both and not considering existing clients out in the
>> > wild. I think them being in separate repos makes that distinction clearer
>> > to someone working on a new feature that spans both code bases.
>> >
>> > Seems like many here want them in the same repo, so I won’t block that,
>> > but I have concerns.
>> >
>> > If we do decide to merge them, I think it should be in a new repo with a
>> > new name. I do not think the sidecar belongs in a repo named analytics,
>> > or the analytics library belongs in a repo named sidecar. They both have
>> > use cases that do not involve the other.
>> >
>> > -Jeremiah Jordan
>> >
>> >
>> > On Jun 3, 2026 at 11:42:15 AM, James Berragan <[email protected]> wrote:
>> >>
>> >> Can we break down a bit more where the circular dependency lies? I'm not
>> >> against it, I just want to make sure we're solving the right problem
>> >> here. Analytics and CDC were always designed to be agnostic of the
>> >> Sidecar. What stops us moving just the Sidecar specific parts into the
>> >> Sidecar repo? I worry if we move into the Sidecar repo it's just going to
>> >> become more coupled and folks in the community are already using
>> >> Analytics to read from e.g. S3 buckets or other data sources.
>> >>
>> >> James.
>> >>
>> >> On Tue, 2 Jun 2026 at 13:20, Josh McKenzie <[email protected]> wrote:
>> >>>
>> >>> I'd like to propose we merge the cassandra-sidecar and
>> >>> cassandra-analytics repositories. I've shopped the idea around to some
>> >>> of you and gotten universally positive feedback with some questions
>> >>> about details we deferred to this discussion.
>> >>>
>> >>> Reasons we should merge:
>> >>>
>> >>> - Break circular dependencies between the 2 projects
>> >>> - Remove redundant copy/pasted code
>> >>> - Simplify build and CI
>> >>> - Reduce friction on changes that span both projects
>> >>> - Simplify the CDC implementation
>> >>>
>> >>>
>> >>> Outstanding questions and observations that came up:
>> >>>
>> >>> - Do we merge one repository into the other? Or do we create a new project
>> >>>   and bring them both in?
>> >>> - What do we do about JIRA? Leave separate or combine?
>> >>> - What do we do with open issues and PRs in GitHub?
>> >>> - We'll need to thoughtfully update CI (github + circle) since we're right
>> >>>   at the limit on the free tier on both projects
>> >>> - What do we do about existing deprecated repositories
>> >>>   (cassandra-analytics and/or cassandra-sidecar)?
>> >>> - We'll need to update our release process
>> >>>
>> >>>
>> >>> Other observations or questions welcome, as are thoughts on the entire
>> >>> process, on the outstanding questions, etc.
>> >>>
>> >>> Looking forward to the discussion everyone.
>> >>>
>> >>> ~Josh
