> In the context of Cassandra, I had actually assumed the Accord timestamp will 
> be used as the cell timestamp for each value? Isn't something like this 
> needed for compaction to work correctly too?

Yes, though we are likely to apply some kind of compression to the timestamp, 
as global timestamps may not fit in a single long and I would prefer not to 
burden the storage system with that complexity. So, when multiple 
transactions are agreed with the same wall-clock time but different global 
timestamps, we will probably increment the timestamp that is applied on the 
local node. That is to say, the storage timestamp will be derived from the 
transaction's timestamp and the transaction timestamps of its dependencies. In 
reality this will come into play very rarely, of course.
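To make that derivation concrete, a rough sketch (Python; the function and its shape are purely illustrative, not Accord's actual scheme):

```python
# Illustrative sketch only: derive a single-long storage timestamp from a
# transaction's wall-clock micros and the storage timestamps already applied
# for its dependencies. Names and structure are hypothetical, not Accord's.

def storage_timestamp(txn_ts_micros, dep_storage_ts):
    # Normally the wall clock alone suffices; if a dependency already used
    # this wall-clock value, bump past it so storage timestamps stay
    # monotonic on the local node.
    base = max(dep_storage_ts, default=0)
    return max(txn_ts_micros, base + 1)

# Two transactions agreed at the same wall-clock micros but with different
# global timestamps still get distinct, ordered storage timestamps:
t1 = storage_timestamp(1_000, [])    # no dependencies at this key
t2 = storage_timestamp(1_000, [t1])  # same wall clock, depends on t1
```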

I think this blurs the lines between systems, however. I _think_ the point 
Alex is making (correct me if I’m wrong) is that the transaction system will 
need to track the transaction timestamps that were witnessed by each read for 
each key, in order to verify that they remain valid on commit. These might both 
be fetched from the storage system on each round (or might be from Accord’s 
non-interactive transaction bookkeeping), but the _interactive_ transaction 
bookkeeping will need to maintain these values separately as part of the 
interactive transaction state (perhaps on the client).
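A minimal sketch of what I mean by that interactive bookkeeping (Python; an entirely hypothetical structure, just to make the idea concrete):

```python
# Hypothetical sketch: the interactive-transaction state records, for each
# key read, the transaction timestamp that was witnessed, so that on commit
# those reads can be verified to still be valid.

class InteractiveTxn:
    def __init__(self):
        self.read_ts = {}  # key -> transaction timestamp witnessed by the read

    def record_read(self, key, witnessed_ts):
        self.read_ts[key] = witnessed_ts

    def still_valid(self, latest_ts_for):
        # On commit: the reads remain valid iff no key they witnessed has
        # since been modified at a newer transaction timestamp.
        return all(latest_ts_for(k) == ts for k, ts in self.read_ts.items())

txn = InteractiveTxn()
txn.record_read("k1", witnessed_ts=5)
ok = txn.still_valid(lambda k: 5)     # nothing changed since the read
stale = txn.still_valid(lambda k: 7)  # k1 modified after the read
```

Whether this state lives on the client or a coordinator is an open question; the check itself is the same either way.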

> Alternatively … some backpressure mechanism seems necessary to throttle new
> transactions while previously committed ones are still being applied

Yes, this is something I envisage being desirable even without complex 
transactions, to prevent DoS problems. We likely want to prevent new 
transactions from being started if the dependency set they would adopt is too 
large, and I think this is relatively straightforward.
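As a sketch of the kind of admission control I have in mind (the threshold and data shapes are purely illustrative):

```python
# Illustrative backpressure sketch: refuse to start a transaction whose
# adopted dependency set (the not-yet-applied transactions on the keys it
# touches) would exceed a threshold. MAX_DEPS is an arbitrary example value.

MAX_DEPS = 4

def try_start(keys, inflight):
    """inflight maps key -> set of ids of committed-but-not-applied txns."""
    deps = set()
    for k in keys:
        deps |= inflight.get(k, set())
    if len(deps) > MAX_DEPS:
        return None  # backpressure: caller should back off and retry
    return deps      # the dependency set the new transaction adopts

small = try_start(["a", "b"], {"a": {1, 2}, "b": {3}})
big = try_start(["a"], {"a": {1, 2, 3, 4, 5}})  # too much unapplied work
```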


From: Henrik Ingo <henrik.i...@datastax.com>
Date: Wednesday, 13 October 2021 at 11:25
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-15: General Purpose Transactions
On Wed, Oct 13, 2021 at 12:25 AM Alex Miller <millerde...@gmail.com> wrote:

> I have, purely out of laziness, been engaging on this topic on ASF Slack as
> opposed to dev@[1].  Benedict has been overly generous in answering
> questions and considering future optimizations there, but it means that I
> inadvertently forked the conversation on this topic.  To bring the
> highlights of that conversation back to the dev list:
>
>
Thanks for contributing to the discussion Alex! Your points and experience
seem rather valuable.




> == Interactive Transactions
>
>
Heh, it seems we sent these almost concurrently :-) Thanks for contributing
this. I think for many readers debating concrete examples is easier, even
if we are talking about future opportunities that are not in scope for the
CEP. It helps to see a path forward.


We also had a bit of discussion over implementation constraints on the
> conflict checking.  Without supporting optimistic transactions, Accord only
> needs to keep track of the read/write sets of transactions which are still
> in flight.  To support optimistic transactions, Accord would need to
> bookkeep the most recent timestamp at which the key was modified, for every
> key.  There's some databases (e.g. CockroachDB, FoundationDB) which have a
> similar need, and use similar data structures which could be copied.
>
>
In the context of Cassandra, I had actually assumed the Accord timestamp
will be used as the cell timestamp for each value? Isn't something like
this needed for compaction to work correctly too?

Committing a transaction before execution means the database is committed
> to performing the deferred work of transaction execution.  In some fashion,
> the expressiveness and complexity of the query language needs to be
> constrained to place limitations on the execution time or resources. Fauna
> invented FQL with a specific set of limitations for a presumable reason.
> CQL seems to already be a reasonably limited query language that doesn't
> easily lend itself to succinctly expressing an inordinate amount of work,
> which would make it already reasonably suited as a query language for
> Accord.
>
>
Alternatively - in a future where the query language evolves to be more
complex - some backpressure mechanism seems necessary to throttle new
transactions while previously committed ones are still being applied. (For
those of you that started reading up on Galera from my previous email, see
"flow control")




> Any query which can't pre-declare its read and write sets must attempt to
> pre-execute enough of the query to determine them, and then submit the
> transaction as optimistic on all values read during the partial execution
> still being untouched.  Most notably, all workloads that utilize secondary
> indexes are affected, and degrade from being guaranteed to commit, to being
> optimistic and potentially requiring retries.  This transformed Calvin into
> an optimistic protocol, and one that's significantly less efficient than
> classic execute-then-commit designs.  Accord is similarly affected, though
> the window of optimism would likely be smaller.  However, it seems like
> most common ways to end up in this situation are already discouraged or
> prevented.  CQL's own restrictions prevent many forms of queries which
> result in unclear read and write sets.  In my highly limited Cassandra
> experience, I've generally seen Secondary Indexes be cautioned against
> already.
>
>
See CEP-7, which is independently proposing a new set of secondary indexes
that we hope will be usable.

Rather than needing to re-execute anything, in my head I had thought that
for Accord to support secondary indexes, the write set is extended to also
cover the secondary index keys read or modified. Essentially this is like
thinking of a secondary index entry as its own primary key. Mutations that
change indexed columns would add both their PK and the secondary index keys
they modified to the write set. A read query would then check its
dependencies against whatever indexes (PK or secondary) it uses to execute
itself, and nothing more.

The above is saying that for a given snapshot/timestamp, the result of a
statement is equally well defined by the secondary index keys used as it is
by the primary keys returned from those secondary index keys.
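A toy sketch of that framing (Python; the tuple encoding of keys is just illustrative):

```python
# Hypothetical sketch: treat each secondary-index entry as a key in its own
# right. A mutation's write set covers its PK plus the index entries it
# removes and adds; an index-backed read declares the index keys it scanned.

def write_set(pk, old_row, new_row, indexed_cols):
    ws = {("pk", pk)}
    for col in indexed_cols:
        if old_row.get(col) is not None:
            ws.add(("idx", col, old_row[col]))  # entry removed by the update
        if new_row.get(col) is not None:
            ws.add(("idx", col, new_row[col]))  # entry added by the update
    return ws

def index_read_set(col, value):
    # A read served by a secondary index conflicts on the index key it used,
    # not on every primary key it happened to return.
    return {("idx", col, value)}

ws = write_set("k1", {"city": "Oslo"}, {"city": "Turku"}, ["city"])
```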

henrik
--

Henrik Ingo

+358 40 569 7354
