Is this worth a CEP?

It could outline all the concerns and expectations around:
- backwards compatibility
- releases
- API
- repos
- Jira
- CI
- etc.

It could also help us make some promises and work towards them, document
them more explicitly, and make it easier for newcomers to find out what the
expectations are. Does that make sense?

I mean, it doesn’t have to be a 10-page CEP.
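
As a small illustration of how explicit such a promise could be, the versioning rule proposed in the quoted thread below (same major.minor means a guaranteed-compatible Analytics/Sidecar pair, with Sidecar patch releases free to advance) fits in a few lines. This is only a sketch under that assumption; the names `Version` and `is_guaranteed_compatible` are made up for the example and are not part of either project.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A plain major.minor.patch version (hypothetical helper, not project code)."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Expects exactly three dot-separated integers, e.g. "1.0.5".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_guaranteed_compatible(sidecar: str, analytics: str) -> bool:
    """Assumed rule from the thread: same major.minor is a compatible pair.

    The patch component is ignored, so Sidecar patch releases can advance
    without affecting the guarantee.
    """
    s, a = Version.parse(sidecar), Version.parse(analytics)
    return (s.major, s.minor) == (a.major, a.minor)
```

With this rule, Sidecar 1.0.5 paired with Analytics 1.0.0 is covered by the guarantee, while Sidecar 1.0.5 paired with Analytics 1.1.0 is not, which matches the examples given in the thread.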


On Thu, 4 Jun 2026 at 9:58, Josh McKenzie <[email protected]> wrote:

> I prefer cassandra-ecosystem over cassandra-companion. Keeps our options
> more open going forward (i.e. is a driver a companion? ... no?)
>
> To your point Jeremiah, while you'd think having the 2 projects in
> separate repos would force us to have cleaner APIs defined between them and
> versioning, in practice that's not the case today. The discipline / energy
> required to define a clear API boundary and rev it is probably comparable
> between the 2 paradigms (i.e. status quo dual repo: less discipline
> required, more energy, monorepo: more discipline required, less energy). At
> the end of the day I'd posit this is something we've been very poor at as a
> community across our entire ecosystem. This will be a new muscle for us to
> build regardless of how the repos are set up.
>
> Ideally the 2 projects would be independent of one another and have a
> shared artifact they both depend upon and that API is how we specify
> compatibility. That should be relatively straightforward to do in a
> monorepo w/some refactoring, and if we can get to a shared library we
> publish from a cassandra-ecosystem repo, we can version that and then it's
> as simple as "if projects you're working with support the same shared
> library version, they are compatible".
>
> As I write that out, it strikes me that the shared information between
> them could in theory one day be promoted to a higher architectural tier of
> shared library where we factor out shared code from analytics and the
> sidecar, and we factor out shared code from core Cassandra that the
> ecosystem depends on (i.e. "cassandra-shared", or "cassandra-lib"). Then
> all 3 projects (+ drivers?) could take a dependency on that shared library,
> we rev the version of that, and compatibility is defined by that shared
> substrate.
>
> All very "long term down the road" considerations, but the shape of "get
> things closer together so they're easier to mutate and work with, then
> massage the structure and dependencies to make the boundaries and
> versioning clear through implicit structure" appeals to me.
>
> On Thu, Jun 4, 2026, at 6:00 AM, Shailaja Koppu wrote:
>
> - I like the name cassandra-ecosystem
> - We cannot draw dependency direction between Analytics and Sidecar. With
> Analytics on S3 feature, Analytics can work without Sidecar. Sidecar has
> many features nothing to do with Analytics. So both can be independent of
> each other.
> - The name cassandra-ecosystem allows us to integrate more such
> features/components into the repo
>
>
>
> > On Jun 4, 2026, at 10:50 AM, Štefan Miklošovič <[email protected]>
> wrote:
> >
> > That all makes sense, Yifan.
> >
> > The only issue, and it is less an issue than a consequence
> > of doing it like that: imagine that there is a change in Analytics but
> > none in Sidecar and we release a new version. That means that
> > Analytics would contain a new patch but Sidecar would be a "dummy"
> > release. We would bump the version of Sidecar just for the sake of it.
> > Then people trying to investigate what has changed between these
> > versions would realize that, awkwardly, nothing changed.
> >
> > I can live with it. It is just something to be aware of.
> >
> > On Thu, Jun 4, 2026 at 9:42 AM Yifan Cai <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> Thanks for the great discussion so far. A few thoughts on the open
> questions:
> >>
> >> Naming
> >>
> >> I'd like to suggest cassandra-companion as the name for the merged
> repository. Both existing names create confusion in opposite directions:
> operational features like rolling restart and health monitoring feel out of
> place in cassandra-analytics (Joey's point), while a bulk read/write
> connector library feels out of place in cassandra-sidecar. A new neutral
> name avoids subordinating either project's identity to the other, and is
> broad enough to accommodate future additions beyond Analytics and Sidecar,
> without implying Cassandra core is included, as names like
> cassandra-ecosystem or cassandra-platform might.
> >>
> >> For the JIRA project key, CASSCOMP would be a natural fit.
> >>
> >> API Compatibility
> >>
> >> Jeremiah raises a valid concern — co-locating the client and server
> removes the repo boundary that previously reminded developers they are
> touching a public API surface. Štefan's versioning model addresses the
> consumer-facing question ("what runs with what") well, but we also need
> developer-facing guardrails to mechanically enforce the promise. I'd
> propose combining three layers:
> >>
> >> - Versioning contract (Štefan's model): same major.minor guarantees a
> compatible Analytics/Sidecar pair; patch releases of Sidecar are safe to
> advance independently; new REST endpoints require a minor bump
> >> - Unified version and release cadence: all modules release together under
> the same version number. This directly aligns with the merge's core
> motivation of reducing coordination overhead. The alternative, independent
> module versioning within the monorepo, would essentially recreate the
> cross-repo coordination friction we are trying to eliminate. Conveniently,
> Analytics and Sidecar are currently at the same version number, so there is
> no awkward jump or reset needed at the point of merge.
> >> - CI enforcement: an OpenAPI contract test that fails if a change breaks
> the API surface relative to the previous release, plus a compatibility
> matrix test that runs the N-1 Analytics client against the current Sidecar
> server
> >> - Stability annotations: adopt @PublicApi / @InternalApi / @Stable /
> @Evolving / @Deprecated annotations on the Sidecar API surface, following
> the pattern established by Kafka and Elasticsearch. This makes the contract
> explicit and discoverable in code — a developer touching an annotated
> method immediately sees its stability guarantee and since which version it
> has been public
> >>
> >> The three layers are complementary: the versioning model defines the
> promise, annotations mark the contract in code, and CI enforces the promise
> mechanically. The unified release cadence ensures the promise is always
> evaluated as a whole.
> >>
> >> As a side note — Cassandra core currently lacks this kind of API
> stability clarity, which creates real friction for downstream projects.
> Establishing this practice in the companion project gives us a concrete,
> working reference that could motivate and inform a broader Cassandra core
> evolution down the road. Happy to discuss that separately if there is
> interest.
> >>
> >> Looking forward to hearing everyone's thoughts.
> >>
> >> Thanks
> >> - Yifan
> >>
> >> On Wed, Jun 3, 2026 at 11:32 PM Štefan Miklošovič <
> [email protected]> wrote:
> >>>
> >>> Hi Jeremiah,
> >>>
> >>> for now, what I find difficult and I found myself questioning this
> >>> repeatedly is "what version of Sidecar can I run with Analytics?" Is
> >>> Sidecar 0.2.0 compatible with Analytics 0.4.0? We just don't know
> >>> until we run it and try. There is no compatibility matrix for what
> >>> goes with what. If each component is developed independently then I
> >>> think it will be more messy than if it was released in lock-step.
> >>>
> >>> We might establish a policy that e.g. a patch release of Sidecar is
> >>> compatible with whatever minor in Analytics. For example, we release
> >>> both Sidecar and Analytics under unified version 1.0.0. Then we will
> >>> release 1.0.5 of both next. So we can say that Sidecar 1.0.5 is
> >>> compatible with Analytics 1.0.0. Or Sidecar 1.1.5 is compatible with
> >>> Analytics 1.1.0. Basically, Sidecar is a standalone server app a user
> >>> can run without Analytics but once they are interested in Analytics
> >>> combo, they would need to run with respective Analytics releases.
> >>>
> >>> If we release Analytics and Sidecar 1.1.0 and you have Sidecar 1.0.5
> >>> then you would need to upgrade to 1.1.0 to be sure that it is
> >>> compatible with Analytics 100% while you could just bump patch
> >>> releases for Sidecar endlessly if you are interested in Sidecar
> >>> without Analytics.
> >>>
> >>> This would of course mean that there would need to be awareness in
> >>> "will this patch I want to ship to Sidecar work in related Analytics
> >>> minor version when we release it?". We might also say that a new REST
> >>> endpoint can go only into a new minor version and similar.
> >>>
> >>> This was, of course, just an example and it is all tweakable.
> >>>
> >>> On Wed, Jun 3, 2026 at 11:44 PM Jeremiah Jordan <[email protected]>
> wrote:
> >>>>>
> >>>>> I worry if we move into the Sidecar repo it's just going to become
> more coupled and folks in the community are already using Analytics to read
> from e.g. S3 buckets or other data sources.
> >>>>
> >>>>
> >>>> I have similar concerns.  If we start releasing them in lockstep from
> the same repo, then I worry that people will start making breaking changes
> to sidecar APIs such that existing Analytics jars out in the wild will not
> work, without realizing it.
> >>>>
> >>>> Both cassandra-analytics and the cassandra-sidecar are starting to be
> used out in the world by people in production settings.  My expectation for
> updates to the sidecar APIs is that anything done should not break existing
> clients. When the client and the server are in different repos, it is much
> cleaner and clearer to people that you are exposing an API surface which is
> being consumed externally, and you need to keep things like backwards
> compatibility in mind.  If the client and the server live in the same repo,
> and are released together, I can see people just changing/refactoring both
> and not considering existing clients out in the wild.  I think them being
> in separate repos makes that distinction clearer to someone working on a
> new feature that spans both code bases.
> >>>>
> >>>> Seems like many here want them in the same repo, so I won’t block
> that, but I have concerns.
> >>>>
> >>>> If we do decide to merge them, I think it should be in a new repo
> with a new name.  I do not think the sidecar belongs in a repo named
> analytics, or the analytics library belongs in a repo named sidecar.  They
> both have use cases that do not involve the other.
> >>>>
> >>>> -Jeremiah Jordan
> >>>>
> >>>>
> >>>> On Jun 3, 2026 at 11:42:15 AM, James Berragan <[email protected]>
> wrote:
> >>>>>
> >>>>> Can we break down a bit more where the circular dependency lies? I'm
> not against it; I just want to make sure we're solving the right problem
> here. Analytics and CDC were always designed to be agnostic of the Sidecar.
> What stops us moving just the Sidecar specific parts into the Sidecar repo?
> I worry if we move into the Sidecar repo it's just going to become more
> coupled and folks in the community are already using Analytics to read from
> e.g. S3 buckets or other data sources.
> >>>>>
> >>>>> James.
> >>>>>
> >>>>> On Tue, 2 Jun 2026 at 13:20, Josh McKenzie <[email protected]>
> wrote:
> >>>>>>
> >>>>>> I'd like to propose we merge the cassandra-sidecar and
> cassandra-analytics repositories. I've shopped the idea around to some of
> you and gotten universally positive feedback with some questions about
> details we deferred to this discussion.
> >>>>>>
> >>>>>> Reasons we should merge:
> >>>>>>
> >>>>>> - Break circular dependencies between the 2 projects
> >>>>>> - Remove redundant copy/pasted code
> >>>>>> - Simplify build and CI
> >>>>>> - Reduce friction on changes that span both projects
> >>>>>> - Simplify the CDC implementation
> >>>>>>
> >>>>>>
> >>>>>> Outstanding questions and observations that came up:
> >>>>>>
> >>>>>> - Do we merge one repository into the other? Or do we create a new
> project and bring them both in?
> >>>>>> - What do we do about JIRA? Leave separate or combine?
> >>>>>> - What do we do with open issues and PRs in GitHub?
> >>>>>> - We'll need to thoughtfully update CI (GitHub + Circle) since we're
> right at the limit on the free tier on both projects
> >>>>>> - What do we do about existing deprecated repositories
> (cassandra-analytics and/or cassandra-sidecar)?
> >>>>>> - We'll need to update our release process
> >>>>>>
> >>>>>>
> >>>>>> Other observations or questions welcome, as are thoughts on the
> entire process, on the outstanding questions, etc.
> >>>>>>
> >>>>>> Looking forward to the discussion everyone.
> >>>>>>
> >>>>>> ~Josh
>
>
>
>
