Just responding to one point among many good ones, below ...

On 11/5/2025 10:29 AM, David Capwell wrote:
SQL has been building its surface area for decades, and trying to catch up is a significant effort; making things both correct and performant becomes an issue.  The latest spec now supports graph queries, so signing up to be compatible means we need to implement the below:

SELECT *
FROM GRAPH_TABLE(my_graph
    MATCH (a IS person)-[e IS friends]->(b IS person WHERE b.name = 'Alice')
    WHERE a.name = 'Mary'
    COLUMNS (a.name AS person_a, b.name AS person_b)
);
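For readers unfamiliar with graph queries, here is a rough sketch of what the example asks for, written as an ordinary relational join. It is only an illustration of the intended result, not how GRAPH_TABLE is implemented; the person and friends tables, their columns, and the friends_of helper are assumptions made up for this sketch, and Python's sqlite3 module is used purely as a convenient stand-in SQL engine.

```python
import sqlite3

def friends_of(conn, a_name, b_name):
    """Return (person_a, person_b) pairs where a -[friends]-> b (hypothetical schema)."""
    sql = """
        SELECT a.name AS person_a, b.name AS person_b
        FROM person AS a
        JOIN friends AS e ON e.src = a.id
        JOIN person AS b ON b.id = e.dst
        WHERE a.name = ? AND b.name = ?
    """
    return conn.execute(sql, (a_name, b_name)).fetchall()

# Assumed toy data: vertices in person, directed edges in friends.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (src INTEGER, dst INTEGER);
    INSERT INTO person VALUES (1, 'Mary'), (2, 'Alice'), (3, 'Bob');
    INSERT INTO friends VALUES (1, 2), (1, 3), (3, 2);
""")

rows = friends_of(conn, "Mary", "Alice")
print(rows)
```

A single-hop pattern like this maps onto two joins; longer or variable-length paths do not translate so directly.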

It depends on your definition of compatible. One definition, or litmus test, could be that any application written against PostgreSQL can be pointed at Cassandra and (modulo swapping drivers, endpoints, config) "just work". I.e., Cassandra wouldn't be considered compatible until your example query and others just work on Cassandra.

Another definition is that any application written against CQL can be rewritten against Cassandra SQL (no extra surface area), and then can be pointed at a Postgres instance and "just work" ... though performance and scaling characteristics might be different.

The second definition is reasonable and an easier bar to clear.

Thanks -- Joel.


The above example is just a simple one; it gets far more complex and would be harder for C* to support.


I would be curious to see a gap analysis between CQL and SQL that includes the differences in behavior. I suspect that it will bring a few surprises and provide a more solid foundation for this discussion.

I think this is a good starting point.  There are some nice things in SQL missing in C* that could be implemented without a ton of risk, and opening up the discussion around these areas makes sense to me.

Off the top of my head, here are basic queries that work in SQL but not CQL, and that carry very little risk to support.

SELECT 1 -- simple query to test if the connection is still live

SELECT func(42) FROM system.peers; -- this has led someone I know to implement functions that return constants specifically to work around this limitation…
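As a small illustration of how these two forms behave on the SQL side, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for any SQL database. The peers table and the is_alive helper are assumptions for the example; nothing here talks to Cassandra.

```python
import sqlite3

def is_alive(conn):
    """Liveness probe: True if the connection can still answer SELECT 1."""
    try:
        return conn.execute("SELECT 1").fetchone() == (1,)
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE peers (peer TEXT);
    INSERT INTO peers VALUES ('10.0.0.1'), ('10.0.0.2');
""")

alive = is_alive(conn)

# A function applied to a constant is evaluated once per row of the table.
const_rows = conn.execute("SELECT abs(-42) FROM peers").fetchall()

conn.close()
alive_after_close = is_alive(conn)  # executing on a closed connection raises
```

The probe needs no table at all, which is why SELECT 1 is a common choice for connection-pool health checks.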



On Nov 5, 2025, at 9:15 AM, Jeff Jirsa wrote:

CQL just to demonstrate it’s possible

Fat-node style would indeed be faster, but I'm mostly proving that it's functional

On Nov 5, 2025, at 8:55 AM, Joseph Lynch wrote:


I very much like Jeff, Josh et al.'s proposals around the pluggable stateless API layer. Also, I agree with Chris: I would prefer a simpler API, not a more complex one, for our applications to couple to, e.g. the Java stdlib. This also sets up a really nice path where community members can first build the layers that make sense out-of-tree, and as a project we can choose the successful ones to bring in-tree. Whichever API those layers couple to would be a new semi-public interface, though, and that has to be weighed.

Jeff, I am curious: in that prototype you are hacking on, are you interacting directly with the internode protocol and verb system, or going through CQL? I imagine there could be some strengths to going straight to the internode protocol.

-Joey

On Tue, Nov 4, 2025 at 3:49 PM Josh McKenzie wrote:

    Right. I'm just zooming out a bit more and applying that same
    logical pattern broadly to other API language domains, not just
    SQL. But yes - your point definitely stands.

    On Tue, Nov 4, 2025, at 6:42 PM, Patrick McFadin wrote:
    I’m grooving on what “Cloud Native Jeff” is saying here and I
    would like to see where this could go. If we use a well
    established library like Calcite, then there is no API to
    maintain. We might find parts of Cassandra along the way we
    could alter to make it easier to integrate, but so far that’s
    just a premature optimization.

    Suuuuper interested to see the TPC-C when you have it, Jeff.

    > On Nov 4, 2025, at 3:25 PM, Jeff Jirsa wrote:
    >
    >
    >
    > On 2025/11/04 22:32:08 Josh McKenzie wrote:
    >>
    >> So I guess what I'm noodling on here is a superset of what
    Patrick is w/a slight modification, where we double down on CQL
    as being the "low level high performance" API for C*, and have
    SQL and other APIs built on top of that.
    >>
    >
    > Again from
    https://lists.apache.org/thread/hdwf0g7pnnko7m84yxn87lybnlcdvn50
    >
    >> Or is it building a native SQL implementation stateless on
    top of a backing ordered (ByteOrderedPartitioner),
    transactional (accord), key-value cassandra cluster ? It’s an
    extra hop, but trying to adjust the existing grammar / DDL to
    fit into a language it always mimicked but never implemented
    faithfully feels like a bumpy road, where there are many
    successful existence proofs for building it stateless a layer
    above.
    >
    > TiKV / TiDB, FoundationDB, etc, etc, etc.
    >
    > If you have a transactional, performant, ordered KV store,
    you can build almost any high-level database on top of it. You
    can expose even lower layer primitives (like placement) to
    optimize for it.
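To make the layering idea concrete, here is a minimal sketch of a table mapped onto an ordered key-value store. The in-memory OrderedKV class stands in for the backing cluster, and the key encoding (table name, a NUL separator, then an 8-byte big-endian primary key) is an assumption chosen for illustration, not the format used by TiDB, FoundationDB, or the prototype discussed in this thread. Transactions are omitted entirely.

```python
import bisect

class OrderedKV:
    """Toy in-memory ordered key-value store (stand-in for the backing cluster)."""

    def __init__(self):
        self._keys = []   # sorted list of byte keys
        self._vals = {}   # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def scan(self, start: bytes, end: bytes):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._vals[k]) for k in self._keys[lo:hi]]

def row_key(table: str, pk: int) -> bytes:
    # Big-endian encoding keeps byte order equal to numeric order
    # for non-negative primary keys.
    return table.encode() + b"\x00" + pk.to_bytes(8, "big")

def insert_row(kv: OrderedKV, table: str, pk: int, value: str) -> None:
    kv.put(row_key(table, pk), value.encode())

def select_range(kv: OrderedKV, table: str, pk_lo: int, pk_hi: int):
    """Like SELECT ... WHERE pk >= pk_lo AND pk < pk_hi, as one range scan."""
    rows = kv.scan(row_key(table, pk_lo), row_key(table, pk_hi))
    return [(int.from_bytes(k[-8:], "big"), v.decode()) for k, v in rows]

kv = OrderedKV()
insert_row(kv, "users", 300, "carol")
insert_row(kv, "users", 2, "alice")
insert_row(kv, "users", 10, "bob")
insert_row(kv, "orders", 5, "widget")

result = select_range(kv, "users", 0, 100)
print(result)
```

Because keys sort by table and then by primary key, a range predicate on the primary key becomes a single contiguous scan, which is the property an ordered partitioner would provide in the design described above.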


