I agree on splitting this up. I'll do that today.

Patrick

On Wed, Nov 5, 2025 at 10:44 AM Dinesh Joshi <[email protected]> wrote:

> There are two distinct conversations in this thread.
>
> 1. What does the evolution of CQL Syntax look like?
> 2. What is the path to bring SQL to Cassandra?
>
> I suggest we fork 2 discuss threads to have a focused discussion on each
> topic.
>
> Thanks,
>
> Dinesh
>
> On Wed, Nov 5, 2025 at 10:29 AM David Capwell <[email protected]> wrote:
>
>> My personal stance is that new work should look at existing syntax and
>> ask the question “why are we different”, if the answer is “I prefer this”
>> or “I didn’t have the time”, I want to push back against this and argue for
>> SQL / Postgres w/e possible. If the answer is “correctness” or
>> “performance” I am far more open to do things our own way.
>>
>> Given the above, I don’t like having a requirement we must be SQL /
>> Postgres compliant, but I do think it’s a good guide post to keep in mind
>> when we are doing something new.
>>
>> I worry that we already struggle to implement the current surface area of
>> CQL correctly and in a way that scales safely.
>>
>>
>> This has been a big issue for me over the past few years, when we
>> implement features correctness / semantics have not historically been given
>> the thought I feel that they deserve; we have so many weird behaviors that
>> leak into user land (batch / CAS failures come to mind as they are
>> constantly making me sad… why is the “short” type variable length? WHY DO
>> WE HAVE MEANINGLESS EMPTYNESS!!!!); we have gotten much better over the
>> years though… not all negative here =)
>>
>> SQL has been building its surface area for decades and trying to catch up
>> is a significant effort and how to make things correct and performant
>> becomes an issue.
>> In the latest spec there is now support for graph queries, so signing
>> up to be compatible means we need to implement the below
>>
>> SELECT *
>> FROM GRAPH_TABLE(my_graph
>>   MATCH (a IS person)-[e IS friends]->(b IS person WHERE b.name = 'Alice')
>>   WHERE a.name = 'Mary'
>>   COLUMNS (a.name AS person_a, b.name AS person_b)
>> );
>>
>> The above is just a simple example, it gets far more complex
>> and would be harder for C* to support.
>>
>>
>> I would be curious to see a gap analysis between CQL and SQL that includes
>> the differences in behaviors. I suspect that it will bring a few surprises
>> and provide some more solid foundation to this discussion.
>>
>>
>> I think this is a good starting point. There are some nice things in SQL
>> missing in C* that could be implemented without a ton of risk, and opening
>> up the discussion around these areas makes sense to me.
>>
>> Off the top of my head, here are basic queries that work in SQL but not
>> CQL, and that would be very low risk to support.
>>
>> SELECT 1 -- simple query to test if the connection is still live
>>
>> SELECT func(42) FROM system.peers; -- this has led someone I know to have
>> to implement functions that return constants specifically to work around
>> this limitation…
>>
>>
>>
>> On Nov 5, 2025, at 9:15 AM, Jeff Jirsa <[email protected]> wrote:
>>
>> CQL just to demonstrate it’s possible
>>
>> Fat node style would indeed be faster but I’m mostly proving that it’s
>> functional
>>
>> On Nov 5, 2025, at 8:55 AM, Joseph Lynch <[email protected]> wrote:
>>
>>
>> I very much like Jeff, Josh et al.'s proposals around the pluggable
>> stateless API layer. Also I agree with Chris I would prefer a simpler API
>> not a more complex one for our applications to couple to e.g. the Java
>> stdlib. This also sets up a really nice path where the community members
>> can build the layers that make sense first out-of-tree, and as a project we
>> can choose the successful ones to bring in-tree.
>> Whichever API those layers couple to would be a new semi-public interface
>> though which has to be weighed.
>>
>> Jeff I am curious, in that prototype you are hacking are you interacting
>> directly with the internode protocol and verb system or going through CQL?
>> I imagine there could be some strengths to going straight to the internode?
>>
>> -Joey
>>
>> On Tue, Nov 4, 2025 at 3:49 PM Josh McKenzie <[email protected]>
>> wrote:
>>
>>> Again from
>>>
>>> Right. I'm just zooming out a bit more and applying that same logical
>>> pattern broadly to other API language domains, not just SQL. But yes - your
>>> point definitely stands.
>>>
>>> On Tue, Nov 4, 2025, at 6:42 PM, Patrick McFadin wrote:
>>>
>>> I’m grooving on what “Cloud Native Jeff” is saying here and I would like
>>> to see where this could go. If we use a well established library like
>>> Calcite, then there is no API to maintain. We might find parts of Cassandra
>>> along the way we could alter to make it easier to integrate, but so far
>>> that’s just a premature optimization.
>>>
>>> Suuuuper interested to see the TPC-C when you have it, Jeff.
>>>
>>> > On Nov 4, 2025, at 3:25 PM, Jeff Jirsa <[email protected]> wrote:
>>> >
>>> >
>>> >
>>> > On 2025/11/04 22:32:08 Josh McKenzie wrote:
>>> >>
>>> >> So I guess what I'm noodling on here is a superset of what Patrick is
>>> >> w/a slight modification, where we double down on CQL as being the "low
>>> >> level high performance" API for C*, and have SQL and other APIs built
>>> >> on top of that.
>>> >>
>>> >
>>> > Again from
>>> > https://lists.apache.org/thread/hdwf0g7pnnko7m84yxn87lybnlcdvn50
>>> >
>>> >> Or is it building a native SQL implementation stateless on top of a
>>> >> backing ordered (ByteOrderedPartitioner), transactional (accord),
>>> >> key-value cassandra cluster ?
>>> >> It’s an extra hop, but trying to adjust the existing grammar / DDL to
>>> >> fit into a language it always mimicked but never implemented faithfully
>>> >> feels like a bumpy road, where there are many successful existence
>>> >> proofs for building it stateless a layer above.
>>> >
>>> > TiKV / TiDB, FoundationDB, etc, etc, etc.
>>> >
>>> > If you have a transactional, performant, ordered KV store, you can
>>> > build almost any high level database on top of it. You can expose even
>>> > lower layer primitives (like placement) to optimize for it.
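
The last point quoted above, that an ordered KV store is enough to layer a higher-level database on top, can be illustrated with a small sketch. This is a toy under stated assumptions, not how TiDB, FoundationDB, or any proposed Cassandra layer actually works: `OrderedKV`, `encode_key`, `insert_row`, and `select_range` are hypothetical names, the store is an in-memory sorted list standing in for a byte-ordered cluster, transactions are left out entirely, and primary keys are assumed to be non-negative 64-bit integers.

```python
import bisect
import json
import struct


class OrderedKV:
    """Toy in-memory ordered key-value store (hypothetical stand-in
    for a byte-ordered KV cluster; no transactions, no persistence)."""

    def __init__(self):
        self._keys = []   # keys kept sorted in byte order
        self._vals = {}   # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def scan(self, start: bytes, end: bytes):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for k in self._keys[lo:hi]:
            yield k, self._vals[k]


def encode_key(table: str, pk: int) -> bytes:
    # Big-endian unsigned encoding makes byte order match numeric order,
    # which is what lets a range scan answer a range predicate.
    return table.encode() + b"\x00" + struct.pack(">Q", pk)


def insert_row(kv: OrderedKV, table: str, pk: int, row: dict) -> None:
    kv.put(encode_key(table, pk), json.dumps(row).encode())


def select_range(kv: OrderedKV, table: str, lo: int, hi: int):
    """Roughly: SELECT * FROM table WHERE pk >= lo AND pk < hi ORDER BY pk."""
    return [
        (struct.unpack(">Q", k[-8:])[0], json.loads(v))
        for k, v in kv.scan(encode_key(table, lo), encode_key(table, hi))
    ]


if __name__ == "__main__":
    kv = OrderedKV()
    insert_row(kv, "person", 3, {"name": "Bob"})
    insert_row(kv, "person", 1, {"name": "Mary"})
    insert_row(kv, "person", 2, {"name": "Alice"})
    print(select_range(kv, "person", 1, 3))
```

The stateless layer here is just the three functions: all state lives in the ordered store, so the SQL-ish range query reduces to one key-range scan.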
