> One other motivation for forking is that we can fix issues one time rather
> than have to fix in 5 branches that have slightly different versions of our
> libraries.

The pain on this one is real. Spit-balling, but I wonder if there'd be a way to sustainably have all GA branches depend on this code from trunk and we use testing and validation to ensure the code on trunk stays compatible with older releases.
There's a lot of complexity there since we'd need CI updated to run that subset of tooling tests across all GA branches before a commit (i.e. trunk-only changes would then potentially impact all GA branches), but maybe that actually wouldn't be so bad if we just had a new pipeline that pulled and built all GA branches from HEAD and ran through the tooling test suites against those releases. That, and it'd only really be in scope if you were making changes to that tooling.

That said, it would seem pretty weird for 5.0 to need to check out code from the trunk branch to build and run tests against though... =/

> My primary need is for test utilities so my focus is there.

Hm. Yeah, the more I think through this, having a versioned set of test utilities in trunk for instance would definitely feel like "crossing the streams" (i.e. PropertyTestingBase4.0, PropertyTestingBase4.1, etc). Big separation of concerns / scope failure if people working on a trunk branch in C* are having to think about other branches and API breakage with them (more so than we already have to w/mixed version upgrades etc.)

Having things like that in a separate repo where we could iterate on things to update for a single branch would alleviate that immediate versioning / mismatch context leak, but that introduces the inverse problem where you'd have to make a change across N branches on the shared library if you have a patch that introduces testing that hits all our GA C* branches and need to backport that functionality instead of changing it in one place. Blech.

So as I was drafting the above, my thinking has distilled down to the following as being important to have a shared mental model on:

• Do we expect the shared functionality in this lib would change frequently in ways that would impact multiple branches, or do we think it would be mostly stable for older branches and mutate more frequently on trunk?
• If the former (multi-branch impacting blast radius, we keep older GA branches in sync / compatible with test harness changes), a single golden copy of the shared code that each branch shares would minimize toil
• If the latter (mostly stable, trunk only changes) then having a branch of tools per GA branch would be optimal

From a workflow perspective, a shared library factored out to its own repo and embedded into C* branches as a submodule has some attractive properties either way. It gives you "best of both worlds" (or least-worst-option) by allowing you to work on things seamlessly as though they were one project but keep the branching strategies of the tooling and the dependents decoupled. Even if we only had 1 branch of the test tooling that all C* versions depended on, having it separate and embedded as a submodule should give us the same devx ergonomics while preserving the option to customize per C* branch fairly easily.

On Fri, Jun 5, 2026, at 9:25 AM, David Capwell wrote:
> One other motivation for forking is that we can fix issues one time rather
> than have to fix in 5 branches that have slightly different versions of our
> libraries. A recent example is CASSANDRA-21216 which was a bug fix for btree.
>
> One of the other reasons brought up in the past is that many libraries are
> needed by accord but accord can’t depend on Cassandra else we have a cyclical
> dependency, so forking off lets accord use our libraries. For the time
> being accord had to fork many libraries in accord to make progress; this is a
> common issue right now.
>
> Sent from my iPhone
>
>> On Jun 3, 2026, at 1:45 PM, Josh McKenzie <[email protected]> wrote:
>>
>>> delays this effort for years as we need time to get people on board and
>>> used to gradle before we flip that switch.
>> Oof. I'm way more optimistic on this one; if we can get a PR that has ant
>> targets as dumb wrappers that instead call gradle targets (i.e. all
>> workflows and local scripting Just Work), I don't see why we couldn't merge
>> that as soon as we ironed out kinks.
>>
>> Is there anyone that's broadly against that approach? Or did I just
>> misunderstand the other thread / JIRA you'd created David?
>>
>> On Wed, Jun 3, 2026, at 1:21 PM, David Capwell wrote:
>>> Fair point but one thing to point out, if this work depends on gradle that
>>> delays this effort for years as we need time to get people on board and
>>> used to gradle before we flip that switch. So leaving in tree means we
>>> have to hand roll all that logic in ant.
>>>
>>> Sent from my iPhone
>>>
>>>> On Jun 3, 2026, at 12:33 PM, Jon Haddad <[email protected]> wrote:
>>>>
>>>> Josh is right. Gradle subprojects could allow this without dealing with
>>>> separate repo. I've done this before and am about to again for some stuff
>>>> I maintain. I spent a long time agonizing over this for my other projects
>>>> and found it works exceptionally well, especially bc you frequently
>>>> develop things that are tightly coupled.
>>>>
>>>> Juggling repos sucks, this solves it (imo) perfectly.
>>>>
>>>> Jon
>>>>
>>>> On Tue, Jun 2, 2026 at 1:18 PM Josh McKenzie <[email protected]> wrote:
>>>>>> Is there a reason not to use a folder in the current repo that becomes
>>>>>> its own jar? It can even be published separately if we like?
>>>>>
>>>>>> Mostly to decouple from Cassandra release.
>>>>> I *think* we could just have that .jar release on its own cadence
>>>>> independently of the parent C* project.
>>>>>
>>>>> Some of us have talked about taking this same approach to making some
>>>>> code from C* available to the ecosystem (think I/O .jar that has SSTable
>>>>> read/write, CommitLog read/write, etc). This feels like a very similarly
>>>>> shaped thing.
>>>>>
>>>>> I assume w/a modern build / publish / etc system we'd be able to publish
>>>>> a release that represents a strict subset of the parent project out of
>>>>> the repo right?
>>>>>
>>>>> On Mon, Jun 1, 2026, at 8:18 PM, David Capwell wrote:
>>>>>> Mostly to decouple from Cassandra release. If there is a feature added
>>>>>> does it have to wait for the next major release of Cassandra so others
>>>>>> can consume? Even if we can get to yearly releases that’s still a long
>>>>>> wait.
>>>>>>
>>>>>> For example Alex and I have been talking about proper fuzz testing, so
>>>>>> best case is a year before 3rd parties could use.
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>>> On Jun 1, 2026, at 4:32 PM, Jeremiah Jordan <[email protected]> wrote:
>>>>>>>
>>>>>>> Does it need to be a separate repo? Is there a reason not to use a
>>>>>>> folder in the current repo that becomes its own jar? It can even be
>>>>>>> published separately if we like?
>>>>>>>
>>>>>>> -Jeremiah
>>>>>>>
>>>>>>> On Jun 1, 2026 at 10:00:15 AM, David Capwell <[email protected]> wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> We've discussed pulling utilities out of trunk before. I'd like to
>>>>>>>> actually start. My primary need is for test utilities so my focus is
>>>>>>>> there.
>>>>>>>>
>>>>>>>> This isn't just my need. Sidecar wants property/stateful tests but
>>>>>>>> can't use ours without a published jar.
>>>>>>>>
>>>>>>>> Proposed approach:
>>>>>>>>
>>>>>>>> 1. Define scope — start with property/stateful test utilities
>>>>>>>> 2. Set up the repo and release independently of Cassandra
>>>>>>>> 3. ...
>>>>>>>> 4. Cassandra depends on the library
>>>>>>>>
>>>>>>>> I'd focus on the fork first, before making Cassandra depend on it —
>>>>>>>> keeps our builds simple and gives the lib room to stabilize. We can
>>>>>>>> sort out the dependency question later (wait on releases, or use
>>>>>>>> submodules?).
>>>>>>>>
>>>>>>>> Happy to drive this if there's interest.
>>>>>>>>
>>>>>>>> Sent from my iPhone
