> I was thinking that a path similar to Calvin/FaunaDB is certainly looming in 
> the horizon at least.

I’m not sure which aspect of these systems you are referring to. Unless I have 
misunderstood, I consider them to be strictly inferior approaches (particularly 
for Cassandra) as they require a _global_ leader process and as a result have 
scalability limits. Users simply shift the sharding problem to the cluster 
level rather than the node level, but the fundamental problem remains. This may 
be acceptable for many users, but was contrary to the goals of this CEP.

> It seems to me at that point long running queries and interactive 
> transactions are mostly the same problem.

I would estimate long running queries to be easier to deliver by at least an 
order of magnitude. They’re not unrelated, but they’re still quite distinct in 
my opinion.

> good job pulling together ingredients from state of the art work in this area

In case this was lost in the noise: this work is not simply an assembly of 
prior work. It introduces entirely novel approaches that permit the work to 
exceed the capabilities of any prior research or production system. It is worth 
properly highlighting that if we deliver this, Cassandra will have the most 
sophisticated transaction system full stop.

There are to my knowledge no databases offering distributed transactions that 
are both strict serializable and have no scalability bottleneck. Every database 
today clearly aims for this combination, but accepts some trade-off: either 
only guaranteeing serializable isolation, requiring special time keeping 
hardware to guarantee strict serializability, or using a global leader process 
(or using two-phase commit, but this is quite niche).



From: Henrik Ingo <henrik.i...@datastax.com>
Date: Tuesday, 7 September 2021 at 14:06
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-15: General Purpose Transactions

On Tue, Sep 7, 2021 at 12:26 PM bened...@apache.org <bened...@apache.org>
wrote:

> > whether I should *just* think of this as "better and more efficient LWT"
>
> So, the LWT concept is a Cassandra one and doesn’t have an agreed-upon
> definition. My understanding of a core feature/limitation of LWTs is that
> they operate over a single partition, and as a result many operations are
> impossible even in multiple rounds without complex distributed state
> machines. The core improvement here, besides improved performance, is that
> we will be able to operate over any set of keys at-once.
>
>
My bad, I have never used LWT and forgot / didn't know they were single
partition. The CEP makes more sense now.
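
For other readers on the list, the single-partition versus any-set-of-keys
distinction can be illustrated with a toy in-memory model. This is only a
sketch: `ToyStore`, `single_partition_cas` and `multi_key_txn` are
hypothetical names, not Cassandra or CEP-15 APIs, and a single Python
process sidesteps the hard part (agreeing on the outcome across replicas).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Check = Callable[[Optional[int]], bool]


@dataclass
class ToyStore:
    """In-memory stand-in for a cluster; each key plays the role of a partition."""

    rows: Dict[str, int] = field(default_factory=dict)

    def single_partition_cas(self, key: str, expected: int, new: int) -> bool:
        """LWT-style operation: the condition and the write touch exactly one key."""
        if self.rows.get(key) != expected:
            return False
        self.rows[key] = new
        return True

    def multi_key_txn(self, conditions: Dict[str, Check], writes: Dict[str, int]) -> bool:
        """Check conditions over any set of keys, then apply all writes or none."""
        if not all(check(self.rows.get(key)) for key, check in conditions.items()):
            return False  # some condition failed: nothing is written
        self.rows.update(writes)
        return True


store = ToyStore({"alice": 100, "bob": 20})
# Move 30 from alice to bob only if alice has enough and bob exists.
applied = store.multi_key_txn(
    conditions={
        "alice": lambda v: v is not None and v >= 30,
        "bob": lambda v: v is not None,
    },
    writes={"alice": 70, "bob": 50},
)
```

With only `single_partition_cas`, the same transfer would need several
rounds and some external coordination to avoid a half-applied state, which
is the limitation described in the quoted text.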



> How this facility is evolved into user-facing capabilities is an
> open-ended question. Initially of course we will at least support the same
> syntax but remove the restriction on operating over a single partition. I
> haven’t thought about this much, as the CEP is primarily for enabling
> works, but I think we will want to expand the syntax in two ways:
>
>  1) to support more complex conditions (simple AND conditions across all
> partitions seem likely too restrictive, though they might make sense for
> the single partition case);
>  2) to support inserting data from one row into another, potentially with
> transformations being applied (including via UDFs).
>
> These are both relatively manageable improvements that we might want to
> land in the same major release as the transactions themselves. The core
> facility can be expanded quite broadly, though. It would be possible for
> instance to support some interpreted language(s) as part of a query, so
> that arbitrary work can be applied in the transaction.
>

I was thinking that a path similar to Calvin/FaunaDB is certainly looming
in the horizon at least. I've been following those with interest, because
a) it's refreshingly outside of the box thinking, and b) they seem to be
able to push the limitations of this approach much beyond what one might
imagine when reading about it the first time. But like you also point out,
it remains to be seen whether users actually want those kinds of
transactions. We are creatures of habit for sure.



> Or, perhaps the community would rather build atop the feature to support
> interactive transactions at the client. I can’t predict resourcing for
> this, though, and it might be a community effort. I think it would be quite
> tractable once this work lands, however.
>
> > Suppose I wanted to do a long running read-only transaction
>
> So, there’s two sides to this: with and without paging. A long running
> read-only transaction taking a few seconds is quite likely to be fine and
> we will probably support it with some MVCC within the transaction system
> itself. This may or may not be part of v1, it’s hard to predict with
> certainty as this is going to be a large undertaking.
>
> But for paged queries we’d be talking about SNAPSHOT isolation. This is
> likely to be something the community wants to support before long anyway
> and is probably not as hard as you might think. It is probably outside of
> the scope of this work, though the two would dovetail very nicely.
>

I've pointed out to some of my colleagues that since Cassandra's storage
engine is an LSM engine, with some additional work it could become an MVCC
style storage engine. Your thinking here seems to be in the same direction,
even if it's beyond version 1. (Just for context, also for benefit of other
readers on the list, it took MongoDB 5 years and 6 major releases to
develop distributed multi-shard transactions. So it's good to talk about
the general direction, but understanding that this is not something anyone
will finish before Christmas.)

It seems to me at that point long running queries and interactive
transactions are mostly the same problem.

****

Benedict, thanks for the answers. Since I'm not a Cassandra developer I
feel it would be inappropriate for me to express an opinion for or against,
so I'll just end with saying this is an interesting proposal and the
authors have done a good job pulling together ingredients from state of the
art work in this area. As such it will be interesting to follow the
discussion and work from whitepaper to implementation.


A secondary objective was also to just let everyone know I am lurking here.
If you ever want to reach out for an out-of-band discussion, you now have my
contact details.

henrik
