Just responding to one point among many good ones, below ...
On 11/5/2025 10:29 AM, David Capwell wrote:
SQL has been building its surface area for decades and trying to catch
up is a significant effort and how to make things correct and
performant becomes an issue. In the latest spec there is now support
for graph queries, so signing up to be compatible means we need to
implement the below
SELECT *
FROM GRAPH_TABLE(my_graph
MATCH (a IS person)-[e IS friends]->(b IS person WHERE b.name = 'Alice')
WHERE a.name = 'Mary'
COLUMNS (a.name AS person_a, b.name AS person_b)
);
It depends on your definition of compatible. One definition, or litmus
test, could be that any application written against PostgreSQL can be
pointed at Cassandra and (modulo swapping drivers, endpoints, config)
"just work". I.e., Cassandra wouldn't be considered compatible until
your example query and others just work on Cassandra.
Another definition is that any application written against CQL can be
rewritten against Cassandra SQL (no extra surface area), and then can be
pointed at a Postgres instance and "just work" ... though performance
and scaling characteristics might be different.
The second definition is reasonable and an easier bar to clear.
Thanks -- Joel.
The above example is just a simple one; it gets far more
complex and would be harder for C* to support.
I would be curious to see a gap analysis between CQL and SQL
that includes the differences in behaviors. I suspect that it will
bring a few surprises and provide some more solid foundation to this
discussion.
I think this is a good starting point. There are some nice things in
SQL missing in C* that could be implemented without a ton of risk, and
opening up the discussion around these areas makes sense to me.
Off the top of my head, here are basic queries that work in SQL but
not CQL, and that would carry very little risk to support.
SELECT 1 — simple query to test if the connection is still live
SELECT func(42) FROM system.peers; -- this has led someone I know to
have to implement functions that return constants specifically to work
around this limitation…
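For illustration, here is a minimal sketch of how a client might check liveness today without `SELECT 1`: it reads a single row from `system.local` instead. The helper name `connection_is_live` is hypothetical, and `session` is assumed to be any object with an `execute(query)` method that returns an iterable of rows (for example, a session from the Python driver). This is not something proposed in the thread.

```python
# Assumed workaround query: CQL needs a FROM clause, so read one row
# from the node-local system table instead of issuing "SELECT 1".
LIVENESS_QUERY = "SELECT release_version FROM system.local"


def connection_is_live(session) -> bool:
    """Return True if a trivial query round-trips on `session`.

    `session` is assumed to expose execute(query) returning an iterable
    of rows; any exception is treated as "not live".
    """
    try:
        rows = session.execute(LIVENESS_QUERY)
        # system.local holds one row; getting it back means the
        # connection served a real read.
        return next(iter(rows), None) is not None
    except Exception:
        return False
```

With FROM-less selects, the query string would simply become `SELECT 1` and the rest of the helper would stay the same.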
On Nov 5, 2025, at 9:15 AM, Jeff Jirsa <[email protected]> wrote:
CQL just to demonstrate it’s possible
Fat node style would indeed be faster, but I’m mostly proving that it’s
functional
On Nov 5, 2025, at 8:55 AM, Joseph Lynch <[email protected]> wrote:
I very much like Jeff, Josh et al.'s proposals around the pluggable
stateless API layer. Also, I agree with Chris: I would prefer a
simpler API, not a more complex one, for our applications to couple to,
e.g. the Java stdlib. This also sets up a really nice path
where the community members can build the layers that make sense
first out-of-tree, and as a project we can choose the successful
ones to bring in-tree. Whichever API those layers couple to would be
a new semi-public interface though which has to be weighed.
Jeff, I am curious: in that prototype you are hacking on, are you
interacting directly with the internode protocol and verb system or
going through CQL? I imagine there could be some strengths to going
straight to the internode?
-Joey
On Tue, Nov 4, 2025 at 3:49 PM Josh McKenzie <[email protected]>
wrote:
Right. I'm just zooming out a bit more and applying that same
logical pattern broadly to other API language domains, not just
SQL. But yes - your point definitely stands.
On Tue, Nov 4, 2025, at 6:42 PM, Patrick McFadin wrote:
I’m grooving on what “Cloud Native Jeff” is saying here and I
would like to see where this could go. If we use a well
established library like Calcite, then there is no API to
maintain. We might find parts of Cassandra along the way we
could alter to make it easier to integrate, but so far that’s
just a premature optimization.
Suuuuper interested to see the TPC-C when you have it, Jeff.
> On Nov 4, 2025, at 3:25 PM, Jeff Jirsa <[email protected]> wrote:
>
>
>
> On 2025/11/04 22:32:08 Josh McKenzie wrote:
>>
>> So I guess what I'm noodling on here is a superset of what
Patrick is w/a slight modification, where we double down on CQL
as being the "low level high performance" API for C*, and have
SQL and other APIs built on top of that.
>>
>
> Again from
https://lists.apache.org/thread/hdwf0g7pnnko7m84yxn87lybnlcdvn50
>
>> Or is it building a native SQL implementation stateless on
top of a backing ordered (ByteOrderedPartitioner),
transactional (accord), key-value cassandra cluster ? It’s an
extra hop, but trying to adjust the existing grammar / DDL to
fit into a language it always mimicked but never implemented
faithfully feels like a bumpy road, where there are many
successful existence proofs for building it stateless a layer
above.
>
> TiKV / TiDB, FoundationDB, etc, etc, etc.
>
> If you have a transactional, performant, ordered KV store,
you can build almost any high level database on top of it. You
can expose even lower layer primitives (like placement) to
optimize for it.
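A toy sketch of the layering idea above: rows are encoded under ordered keys, so a primary-key range query becomes a single ordered scan of the KV store. Everything here is hypothetical and for illustration only. `OrderedKV` is an in-memory stand-in, not Cassandra, Accord, TiKV, or FoundationDB, and the key encoding is made up. Transactions are left out entirely.

```python
import bisect
import json


class OrderedKV:
    """Toy in-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._keys = []   # byte keys, kept sorted
        self._vals = {}   # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def scan(self, start: bytes, end: bytes):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._vals[key]


def row_key(table: str, pk: str) -> bytes:
    # Assumed encoding: "<table>/<primary key>", so all rows of a table
    # are adjacent and sorted by primary key.
    return f"{table}/{pk}".encode()


def insert_row(kv: OrderedKV, table: str, pk: str, row: dict) -> None:
    kv.put(row_key(table, pk), json.dumps(row).encode())


def select_range(kv: OrderedKV, table: str, pk_from: str, pk_to: str) -> list:
    """Roughly: SELECT * FROM table WHERE pk >= pk_from AND pk < pk_to."""
    start, end = row_key(table, pk_from), row_key(table, pk_to)
    return [json.loads(value) for _, value in kv.scan(start, end)]
```

For example, after inserting rows keyed `mary`, `alice`, and `bob` into a `person` table, `select_range(kv, "person", "a", "c")` returns the `alice` and `bob` rows in key order. A real layer would need an order-preserving, collision-free key encoding and would run each statement inside a store-level transaction.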