Just to be clear, in my initial proposal I said that CQL can never go away. It’s a life sentence. Knowing the upgrade cycle that many users are on, it will be 50 years before we could even try.
I feel we are at a fork here in the discussion.

Fork 1: Discuss and somehow ratify that we adhere to SQL syntax for new CQL features.
Fork 2: Form a SIG or start a new DISCUSS thread on how to add SQL as a formal path. Really good ideas have already been thrown around, and that should continue. Josh wrapped it up nicely with a “Stateless layer that could serve many purposes”.

That’s my proposal. WDYT?

Patrick

> On Nov 4, 2025, at 11:18 AM, Josh McKenzie <[email protected]> wrote:
>
> Good point Joey; I was rather focused on the ergonomics of implicit
> constraints that come with CQL vs. SQL and the gap we'd have to bridge to make
> a SQL-centric world have the same design language as CQL today.
>
> We can't afford to drop CQL at this point unless we had an overwhelmingly
> bullet-proof CQL->SQL translation layer that didn't introduce new edge cases
> nor performance degradation compared to CQL directly today. Users would have
> to have the ability for existing CQL applications to Just Work when migrated
> onto some new paradigm where the existing CQL native protocol endpoints were
> deprecated. At that point we'd just be weighing the cost of maintaining a
> translation layer between API semantics vs. a translation layer between the
> native protocol and the storage engine we already have today; lot of work to
> just be where we are today IMO.
>
> We've learned the hard way that when you remove functionality from the
> database it hurts a lot of users in a lot of ways and we all discussed and
> broadly had a consensus to try not to remove anything going forward on the
> dev ML in the past year as I recall. Removing our core query language would
> be... quite the opposite of what we discussed and agreed to.
>
> Now - SQL layer on top of the storage engine? If people want to work on that
> I think it'd be great for our ecosystem.
> To Chris' point, I think there's
> probably appetite from users' perspectives to have different APIs to interact
> with data in the storage engine, be it gRPC, GraphQL, JSON, CQL over REST,
> CQL, SQL, etc. Us having a layer that allowed us to reasonably build in that
> functionality would be a net win.
>
> On Tue, Nov 4, 2025, at 12:36 PM, Chris Lohfink wrote:
>> Just throwing my 2 cents in. I'm probably in the unpopular camp of wanting
>> to move the other direction towards a gRPC endpoint that is even more
>> restrictive than CQL. This is coming more from a standpoint of needing to
>> clean up after mistakes (application/modeling etc, not Cassandra) than the
>> standpoint of trying to sell people on using the database. I would prefer to
>> see all the features and endpoints we provide work well without breaking
>> than make cool demos and feature bullet points. That said I know in order
>> for a database to be successful we need the cool feature sets as well. CQL
>> works for now and deprecating that would be an absolute nightmare for people
>> already using it (i.e. the Thrift migration was not fun for anyone). I say
>> create a new entrypoint or layer, mark it experimental and allow operators to
>> disable it but leave the existing CQL interface alone.
>>
>> Chris
>>
>> On Tue, Nov 4, 2025 at 10:53 AM Isaac Reath <[email protected]> wrote:
>> I share Joey's opinions on this. Many features that resemble SQL (e.g.,
>> indexes, materialized views) come with caveats that stem from their
>> implementation details rather than the query language itself. If we expose
>> these same features through SQL as they are today, I think we'd risk setting
>> users up for disappointment, since they will come in with implicit
>> expectations about how a given SQL feature should work based on their
>> previous experience and more often than not we won't meet that expectation.
>> At least with CQL we set the expectation that this is a different database,
>> where familiar concepts might behave differently than you would expect.
>>
>> That said, in terms of a long term direction, I think having SQL support is
>> a good guiding light and implementing it as a stateless component as Jeff
>> suggests would help make this easier to realize.
>>
>> On Tue, Nov 4, 2025 at 10:23 AM Joseph Lynch <[email protected]> wrote:
>> Removing CQL is, in my opinion, completely off the table. When we deprecated
>> Thrift and gave CQL as the new query language, we imposed significant pain
>> on our existing functional Thrift applications to migrate to it - I feel we
>> should not hurt our users like that again.
>>
>> I worry that we already struggle to implement the current surface area of
>> CQL correctly and in a way that scales safely. For example, CQL allows us to
>> create arbitrarily large partitions, but large partitions and large columns
>> continue to be something our storage engine can't currently handle well. CQL
>> allows us to create secondary indices for improved filter support but few
>> can (or at least we struggle to) safely use them in production. We still
>> struggle with how page timeouts, hedges and retries work in an idempotent
>> and reliable way in our current protocol - although CQL at least gives us a
>> path to implementing those.
>>
>> I wonder if we should focus on being excellent at the basic write and read
>> operations we already support before adding more complexity at the API
>> layer. I am excited by the recent proposals around unbounded partitions,
>> byte ordered partitioner with safe data movement, ability to execute
>> analytics queries efficiently via a separate columnar representation etc ...
>> and all of those and more would likely be required to tackle SQL in any
>> meaningful way.
>>
>> The surface area of SQL is much much wider, requiring functional
>> implementation of all of that plus joins, interactive transactions and more.
>> The SQL protocol itself is also quite poor for reliable communication and
>> rarely has performant async clients with size based pagination, per page
>> timeouts, per page hedging, incremental progress over a streaming async
>> interface, pagination resumption, etc ... A lot of this difficulty stems
>> from the protocol often being tied to TCP connections and the inherently
>> unbounded complexity of the read interface.
>>
>> I guess I'm saying, I think we should prioritize succeeding at the API scope
>> we already have before adding more. Deferring to standard SQL syntax or
>> naming when we can just seems like a good idea (why reinvent concepts), but
>> I don't think the friction with CQL is because it's not SQL, I think it's
>> because users can't tell what works and what doesn't work.
>>
>> -Joey
>>
>> On Tue, Nov 4, 2025 at 8:42 AM Josh McKenzie <[email protected]> wrote:
>>
>> +1 to Mick and Aleksey. I think the key for me was this:
>>> One is Cassandra’s wide-partition model with flexible clustering columns,
>>> which supports very large, ordered partitions (e.g. time-series and
>>> efficient range scans), rather than a strictly normalised, join-centric
>>> model. These patterns don’t always map cleanly to SQL semantics, and CQL’s
>>> query-driven, table-per-query modelling helps move users toward designs
>>> that scale predictably.
>>
>> We'd need really robust EXPLAIN / EXPLAIN ANALYZE support (see here
>> <https://www.postgresql.org/docs/current/sql-explain.html>) for users to be
>> able to make sense of how their SQL queries translate into underlying disk
>> access patterns.
>> Having a wide-open field of full SQL compliance they then
>> need to understand how to constrain to get horizontal scale out of it would
>> be much more challenging than the already somewhat "new" cognitive muscle
>> our users have to build to realize that horizontal scaling of data access
>> doesn't come free.
>>
>> I think that would give us a future state of "Use SQL when you need / want a
>> lot of expressivity, use CQL when you need to be constrained to language
>> primitives that keep your data access scalable". The part that gets me wary
>> here is how we've run into pain in the past trying to be both a database
>> that allows more query expressivity (ALLOW FILTERING, legacy 2i come to
>> mind) and a database that also wants horizontal scale.
>>
>> I'd love us to be able to have our cake and eat it too but I don't know if
>> that's possible. So at the very least I'd advocate for SQL + CQL going
>> forward, or SQL + a constrained "CQL-like" mode that gives the same
>> constraints CQL does today on modeling that guide people towards that very
>> partitionable path.
>>
>> On Tue, Nov 4, 2025, at 8:12 AM, Aleksey Yeshchenko wrote:
>>> I don’t mind us implementing some Postgres syntax support in some capacity,
>>> but I do not like the idea of limiting what Cassandra is allowed to do, or
>>> expose via CQL, to what is expressible by Postgres’s SQL.
>>>
>>> Many moons ago, before we started work on native protocol and CQL, I could
>>> perhaps see a bigger benefit to going the Postgres route - for the client
>>> protocol and the language. We could piggyback on existing client
>>> infrastructure and SQL familiarity. But at this stage, when we have already
>>> made the effort to develop decent drivers, and CQL is fleshed out, and C* is
>>> quite mature overall, how much would we gain from this transition?
>>>
>>> I’m broadly with Mick here.
>>> And I support using Postgres’ SQL as
>>> inspiration for implementing new CQL features wherever it makes sense -
>>> it’s something we’ve been doing for a decade already. But I don’t believe
>>> that deprecating CQL is the way to go at this point.
>>>
>>> > On 4 Nov 2025, at 06:38, Mick <[email protected]> wrote:
>>> >
>>> >> On 3 Nov 2025, at 20:32, Joel Shepherd <[email protected]> wrote:
>>> >>
>>> >> At the same time, my personal opinion is that if SQL compatibility is
>>> >> pursued, then the end game should be to deprecate CQL. That will
>>> >> probably take years, but at the limit I don't see a lot of benefit to
>>> >> supporting both.
>>> >
>>> > We want SQL, but _why_ (in all its nuances) do we want SQL? A lot is
>>> > obvious, but it is a very broad question.
>>> >
>>> > The adoption and standardisation benefits are obvious, but CQL has
>>> > strengths relative to SQL in Cassandra’s context.
>>> >
>>> > One is Cassandra’s wide-partition model with flexible clustering columns,
>>> > which supports very large, ordered partitions (e.g. time-series and
>>> > efficient range scans), rather than a strictly normalised, join-centric
>>> > model. These patterns don’t always map cleanly to SQL semantics, and
>>> > CQL’s query-driven, table-per-query modelling helps move users toward
>>> > designs that scale predictably.
>>> >
>>> > I can see CQL continuing as Cassandra’s high-throughput, query-driven
>>> > DSL, while we pursue SQL compatibility. I appreciate Dinesh’s ‘lanes’
>>> > framing, e.g. eventually default to a SQL interface (with Accord) for the
>>> > broadest UX, while CQL remains a high-throughput path.
>>> >
>>> > Should we also be discussing storage-engine implications?
>>> > Cassandra’s LSMT/SSTable design optimises write paths, while SQL
>>> > presents a logical view without constraining physical layout; so data
>>> > on disk stays optimised for dominant access patterns. I can also see the
>>> > need to discuss transport vs query language differences.
>>> >
>>> > Are we after both SQL's DML and DDL abilities? Beyond accessibility and
>>> > exploration, SQL often comes with mature tooling for schema change
>>> > management. Cassandra supports online schema changes (e.g., ALTER TABLE),
>>> > but cross-table/primary-key changes remain constrained. A SQL interface
>>> > alone won’t ‘solve’ this: it’s about migration tooling and engine
>>> > capabilities; changing data models at-scale faces separate challenges.
>>> >
>>> > Especially outside of early-stage apps and ad-hoc exploration I find SQL
>>> > less interesting and its ergonomics less aligned with Cassandra’s runtime
>>> > performance model. That doesn't make me opposed to the endeavour of SQL
>>> > compatibility, it pushes me on the why question a bit more for alignment
>>> > clarity to our strengths.
