There are two distinct conversations in this thread.

1. What does the evolution of CQL Syntax look like?
2. What is the path to bring SQL to Cassandra?
I suggest we fork 2 discuss threads to have a focused discussion on each topic.

Thanks,

Dinesh

On Wed, Nov 5, 2025 at 10:29 AM David Capwell <[email protected]> wrote:

> My personal stance is that new work should look at existing syntax and ask
> the question “why are we different”, if the answer is “I prefer this” or “I
> didn’t have the time”, I want to push back against this and argue for SQL /
> Postgres w/e possible. If the answer is “correctness” or “performance” I
> am far more open to do things our own way.
>
> Given the above, I don’t like having a requirement we must be SQL /
> Postgres compliant, but I do think it’s a good guide post to keep in mind
> when we are doing something new.
>
> I worry that we already struggle to implement the current surface area of
> CQL correctly and in a way that scales safely.
>
>
> This has been a big issue for me over the past few years, when we
> implement features correctness / semantics have not historically been given
> the thought I feel that they deserve; we have so many weird behaviors that
> leak into user land (batch / CAS failures come to mind as they are
> constantly making me sad… why is the “short” type variable length? WHY DO
> WE HAVE MEANINGLESS EMPTYNESS!!!!); we have gotten much better over the
> years though… not all negative here =)
>
> SQL has been building its surface area for decades and trying to catch up
> is a significant effort and how to make things correct and performant
> becomes an issue. In the latest spec there is now support for graph
> queries, so signing up to be compatible means we need to implement the below
>
> SELECT *
> FROM GRAPH_TABLE(my_graph
>   MATCH (a IS person)-[e IS friends]->(b IS person WHERE b.name = 'Alice')
>   WHERE a.name = 'Mary'
>   COLUMNS (a.name AS person_a, b.name AS person_b)
> );
>
> That above example is just a simple example, it gets far more complex
> and would be harder for C* to support.
>
>
> I would be curious to see a gap analysis between CQL and SQL that includes
> the differences in behaviors. I suspect that it will bring a few surprises
> and provide some more solid foundation to this discussion.
>
>
> I think this is a good starting point. There are some nice things in SQL
> missing in C* that could be implemented without a ton of risk, and opening
> up the discussion around these areas makes sense to me.
>
> Off the top of my head, here are basic queries that work in SQL but not
> CQL, and there is very little risk in supporting them.
>
> SELECT 1 -- simple query to test if the connection is still live
>
> SELECT func(42) FROM system.peers; -- this has led someone I know to have
> to implement functions that return constants specifically to work around
> this limitation…
>
>
>
> On Nov 5, 2025, at 9:15 AM, Jeff Jirsa <[email protected]> wrote:
>
> CQL just to demonstrate it’s possible
>
> Fat node style would indeed be faster but I’m mostly proving that it’s
> functional
>
> On Nov 5, 2025, at 8:55 AM, Joseph Lynch <[email protected]> wrote:
>
>
> I very much like Jeff, Josh et al.'s proposals around the pluggable
> stateless API layer. Also I agree with Chris I would prefer a simpler API
> not a more complex one for our applications to couple to e.g. the Java
> stdlib. This also sets up a really nice path where the community members
> can build the layers that make sense first out-of-tree, and as a project we
> can choose the successful ones to bring in-tree. Whichever API those layers
> couple to would be a new semi-public interface though, which has to be
> weighed.
>
> Jeff I am curious, in that prototype you are hacking are you interacting
> directly with the internode protocol and verb system or going through CQL?
> I imagine there could be some strengths to going straight to the internode?
>
> -Joey
>
> On Tue, Nov 4, 2025 at 3:49 PM Josh McKenzie <[email protected]> wrote:
>
>> Again from
>>
>> Right.
>> I'm just zooming out a bit more and applying that same logical
>> pattern broadly to other API language domains, not just SQL. But yes - your
>> point definitely stands.
>>
>> On Tue, Nov 4, 2025, at 6:42 PM, Patrick McFadin wrote:
>>
>> I’m grooving on what “Cloud Native Jeff” is saying here and I would like
>> to see where this could go. If we use a well established library like
>> Calcite, then there is no API to maintain. We might find parts of Cassandra
>> along the way we could alter to make it easier to integrate, but so far
>> that’s just a premature optimization.
>>
>> Suuuuper interested to see the TPC-C when you have it, Jeff.
>>
>> > On Nov 4, 2025, at 3:25 PM, Jeff Jirsa <[email protected]> wrote:
>> >
>> >
>> >
>> > On 2025/11/04 22:32:08 Josh McKenzie wrote:
>> >>
>> >> So I guess what I'm noodling on here is a superset of what Patrick is
>> >> w/a slight modification, where we double down on CQL as being the "low
>> >> level high performance" API for C*, and have SQL and other APIs built on
>> >> top of that.
>> >>
>> >
>> > Again from
>> > https://lists.apache.org/thread/hdwf0g7pnnko7m84yxn87lybnlcdvn50
>> >
>> >> Or is it building a native SQL implementation stateless on top of a
>> >> backing ordered (ByteOrderedPartitioner), transactional (accord), key-value
>> >> cassandra cluster ? It’s an extra hop, but trying to adjust the existing
>> >> grammar / DDL to fit into a language it always mimicked but never
>> >> implemented faithfully feels like a bumpy road, where there are many
>> >> successful existence proofs for building it stateless a layer above.
>> >
>> > TiKV / TiDB, FoundationDB, etc, etc, etc.
>> >
>> > If you have a transactional, performant, ordered KV store, you can
>> > build almost any high level database on top of it. You can expose even
>> > lower layer primitives (like placement) to optimize for it.
