Re: [DISCUSS] SQL support in Cassandra

Joel Shepherd Wed, 12 Nov 2025 09:40:26 -0800

Just responding to one point among many good ones, below ...


On 11/5/2025 10:29 AM, David Capwell wrote:

SQL has been building its surface area for decades and trying to catch up is a significant effort and how to make things correct and performant becomes an issue. In the latest spec there is now support for graph queries, so signing up to be compatible means we need to implement the below
SELECT *
FROM GRAPH_TABLE(my_graph
    MATCH (a IS person)-[e IS friends]->(b IS person WHERE b.name = 'Alice')
    WHERE a.name = 'Mary'
    COLUMNS (a.name AS person_a, b.name AS person_b)
);

It depends on your definition of compatible. One definition, or litmus test, could be that any application written against PostgreSQL can be pointed at Cassandra and (modulo swapping drivers, endpoints, config) "just work". I.e., Cassandra wouldn't be considered compatible until your example query and others just work on Cassandra.

Another definition is that any application written against CQL can be rewritten against Cassandra SQL (no extra surface area), and then can be pointed at a Postgres instance and "just work" ... though performance and scaling characteristics might be different.


The second definition is reasonable and an easier bar to clear.

Thanks -- Joel.

That above example is just a simple example; it gets far more complex and would be harder for C* to support.
I would be curious to see a gap analysis between CQL and SQL that includes the differences in behaviors. I suspect that it will bring a few surprises and provide a more solid foundation for this discussion.
I think this is a good starting point. There are some nice things in SQL missing in C* that could be implemented without a ton of risk, and opening up the discussion around these areas makes sense to me.
Off the top of my head, here are basic queries that work in SQL but not CQL, and that carry a very low level of risk to support.
SELECT 1 -- simple query to test if the connection is still live
SELECT func(42) FROM system.peers; -- this has led someone I know to have to implement functions that return constants specifically to work around this limitation…
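
For readers who want to see the two query shapes above running against a SQL engine, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for "some SQL database"; it says nothing about how Cassandra would implement either query. The `peers` table and the helper `connection_is_live` are hypothetical names made up for this illustration, and `abs` stands in for the unspecified `func`.

```python
import sqlite3


def connection_is_live(conn):
    """Return True if a trivial SELECT 1 round-trips on this connection."""
    try:
        return conn.execute("SELECT 1").fetchone() == (1,)
    except sqlite3.Error:
        # A closed or broken connection raises instead of returning a row.
        return False


conn = sqlite3.connect(":memory:")
live_before = connection_is_live(conn)

# Hypothetical one-row table standing in for system.peers.
conn.execute("CREATE TABLE peers (peer TEXT)")
conn.execute("INSERT INTO peers VALUES ('10.0.0.1')")

# A function applied to a literal, evaluated once per row of the table.
rows = conn.execute("SELECT abs(-42) FROM peers").fetchall()

conn.close()
live_after = connection_is_live(conn)
```

In this sketch the liveness check needs no table at all, which is the property the first query relies on.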
On Nov 5, 2025, at 9:15 AM, Jeff Jirsa <[email protected]> wrote:

CQL just to demonstrate it’s possible
Fat node style would indeed be faster but I'm mostly proving that it's functional
On Nov 5, 2025, at 8:55 AM, Joseph Lynch <[email protected]> wrote:
I very much like Jeff, Josh et al.'s proposals around the pluggable stateless API layer. Also, I agree with Chris: I would prefer a simpler API, not a more complex one, for our applications to couple to, e.g. the Java stdlib. This also sets up a really nice path where the community members can build the layers that make sense first out-of-tree, and as a project we can choose the successful ones to bring in-tree. Whichever API those layers couple to would be a new semi-public interface, though, which has to be weighed.
Jeff I am curious, in that prototype you are hacking are you interacting directly with the internode protocol and verb system or going through CQL? I imagine there could be some strengths to going straight to the internode?
-Joey
On Tue, Nov 4, 2025 at 3:49 PM Josh McKenzie <[email protected]> wrote:
    Again from
    Right. I'm just zooming out a bit more and applying that same
    logical pattern broadly to other API language domains, not just
    SQL. But yes - your point definitely stands.

    On Tue, Nov 4, 2025, at 6:42 PM, Patrick McFadin wrote:
    I’m grooving on what “Cloud Native Jeff” is saying here and I
    would like to see where this could go. If we use a well
    established library like Calcite, then there is no API to
    maintain. We might find parts of Cassandra along the way we
    could alter to make it easier to integrate, but so far that’s
    just a premature optimization.

    Suuuuper interested to see the TPC-C when you have it, Jeff.

    > On Nov 4, 2025, at 3:25 PM, Jeff Jirsa <[email protected]> wrote:
    >
    >
    >
    > On 2025/11/04 22:32:08 Josh McKenzie wrote:
    >>
    >> So I guess what I'm noodling on here is a superset of what
    Patrick is w/a slight modification, where we double down on CQL
    as being the "low level high performance" API for C*, and have
    SQL and other APIs built on top of that.
    >>
    >
    > Again from
    https://lists.apache.org/thread/hdwf0g7pnnko7m84yxn87lybnlcdvn50
    >
    >> Or is it building a native SQL implementation stateless on
    top of a backing ordered (ByteOrderedPartitioner),
    transactional (accord), key-value cassandra cluster ? It’s an
    extra hop, but trying to adjust the existing grammar / DDL to
    fit into a language it always mimicked but never implemented
    faithfully feels like a bumpy road, where there are many
    successful existence proofs for building it stateless a layer
    above.
    >
    > TiKV / TiDB, FoundationDB, etc, etc, etc.
    >
    > If you have a transactional, performant, ordered KV store,
    you can build almost any high level database on top of it. You
    can expose even lower layer primitives (like placement) to
    optimize for it.
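
To make the "ordered KV store underneath, higher-level database on top" idea concrete, here is a toy sketch. Everything in it is an assumption made for illustration: `OrderedKV`, `row_key`, `insert_row` and `select_range` are hypothetical names, the store is an in-memory sorted list rather than Cassandra, and transactions are left out entirely. It only shows how a primary-key range query can reduce to one ordered scan over keys of the form (table, primary key).

```python
import bisect


class OrderedKV:
    """Toy in-memory ordered key-value store (hypothetical stand-in)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._vals = {}   # key -> value

    def put(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def scan(self, start, end):
        """Yield (key, value) for start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for k in self._keys[lo:hi]:
            yield k, self._vals[k]


def row_key(table, pk):
    # Tuples compare element-wise, so all rows of one table sort together.
    return (table, pk)


def insert_row(kv, table, pk, row):
    kv.put(row_key(table, pk), row)


def select_range(kv, table, pk_lo, pk_hi):
    """Roughly: SELECT * FROM table WHERE pk >= pk_lo AND pk < pk_hi."""
    scanned = kv.scan(row_key(table, pk_lo), row_key(table, pk_hi))
    return [row for _, row in scanned]


kv = OrderedKV()
insert_row(kv, "person", 3, {"name": "Mary"})
insert_row(kv, "person", 1, {"name": "Alice"})
insert_row(kv, "person", 7, {"name": "Bob"})
insert_row(kv, "pet", 2, {"name": "Rex"})

result = select_range(kv, "person", 1, 5)
```

Because keys are ordered, the range query touches only the contiguous slice of keys for that table and primary-key range, which is why an ordered partitioner matters for this design.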
