Re: [IDEA] Read committed transaction with Accord

bened...@apache.org Wed, 13 Oct 2021 05:38:06 -0700

> It seems like potentially every statement now needs to go through the Accord
> consensus protocol, and this could become expensive, where my goal was to
> design the simplest and most lightweight example thinkable. BUT for read-only
> Accord transactions, where I specifically also don't care about
> serializability, wouldn't this be precisely the case where I can simply pick
> my own timestamp and do a stale read from a nearby replica?


It may have been lost in the back and forth (there have been a lot of emails), but 
I outlined an approach for READ COMMITTED (and SERIALIZABLE) read isolation 
that does not require a WAN trip. However, for any multi-shard protocol it is 
unsafe to perform an uncoordinated read as each shard may have applied 
different transaction state – the timestamp you pick may be in the future on 
some shards.

So, initially, a sequence of ordinary Accord transactions offers READ COMMITTED 
isolation. To improve performance further, it will be possible to offer stale 
reads from the local DC that meet the isolation level.

For single shard operations, picking a timestamp known to the shard is 
perfectly safe.
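
To illustrate the point about timestamps, here is a toy Python sketch. It is
not Accord code: the Replica class and its applied_up_to field are hypothetical
and stand in for whatever a replica knows about how far it has applied
transactions for its shard. It only shows why a timestamp that is fine for one
shard can still be in the future on another.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    shard: str
    applied_up_to: int  # hypothetical: highest timestamp fully applied locally


def can_read_at(replicas, timestamp):
    """True only if no involved replica sees `timestamp` as being in its future."""
    return all(timestamp <= r.applied_up_to for r in replicas)


shard_a = Replica("a", applied_up_to=105)
shard_b = Replica("b", applied_up_to=98)

# Single shard: a timestamp known to that shard can be read without coordination.
single_shard = can_read_at([shard_a], 100)

# Multi shard: the same timestamp is still in the future on shard b.
multi_shard = can_read_at([shard_a, shard_b], 100)
```

In this model the uncoordinated multi-shard read at 100 would have to be
rejected or coordinated, while the single-shard read can proceed.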

> For my purposes I'll just note that needing to re-execute all reads during 
> the Accord phase (commit phase) would make the design more expensive

As noted by Alex, the only thing that needs to be corroborated in this approach 
is the timestamps. However, this is only necessary for SERIALIZABLE isolation 
or above. For READ COMMITTED it would be enough to perform reads with the above 
properties, and to buffer writes until a final Accord transaction round. 
However, as also noted by Alex, the difficulty here is read-your-writes. This 
is a general problem for interactive transactions and, like many of these 
properties, orthogonal to Accord. A simple approach would be to nominate a 
coordinator for the transaction that buffers writes and integrates them into 
the transaction execution. This might impose some restrictions on the size of 
the transaction we want to support, and of course means if the coordinator 
fails the transaction also fails.
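
A minimal sketch of the coordinator approach, in Python, under loud
assumptions: BufferingCoordinator is a hypothetical name, a plain dict stands
in for committed state, and commit() stands in for the final Accord
transaction round. It only shows how buffered writes can be integrated into
the transaction's own reads to give read-your-writes.

```python
class BufferingCoordinator:
    """Hypothetical per-transaction coordinator that buffers writes."""

    def __init__(self, committed):
        self.committed = committed  # dict standing in for committed state
        self.buffer = {}            # writes not yet visible to anyone else

    def write(self, key, value):
        self.buffer[key] = value

    def read(self, key):
        # Read-your-writes: prefer the transaction's own buffered value.
        if key in self.buffer:
            return self.buffer[key]
        return self.committed.get(key)

    def commit(self):
        # Stand-in for the final Accord round that applies all buffered writes.
        self.committed.update(self.buffer)
        self.buffer.clear()


store = {"x": 1}
txn = BufferingCoordinator(store)
txn.write("x", 2)
own_view = txn.read("x")        # the transaction sees its own write
others_view = store["x"]        # committed state is unchanged before commit
txn.commit()
after_commit = store["x"]
```

Since the buffer lives only in the coordinator, losing the coordinator loses
the transaction, and the buffer size bounds the transaction size, as noted
above.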

If we want to remove all restrictions, we are back at the monoculture of 
Cockroach, YugaByte et al., which, again, may both be implemented by, and 
co-exist with, Accord.

In this world, complex transactions would insert read and write intents using 
an Accord operation, along with a transaction state record. If transactions 
insert conflicting intents, the concurrency control arbitration mechanism 
decides what happens (whether one transaction blocks, aborts, or what have 
you). There is a bunch of literature on this that is orthogonal to the Accord 
discussion.

In the case of READ COMMITTED, I believe this can be particularly simple – we 
don’t need any read intents, only write intents which are essentially a 
distributed write buffer. The storage system uses these to answer reads from 
the transaction that inserted the write intents, but they are ignored for all 
other transactions. In this case, there is no arbitration needed as there is 
never a need for one transaction to prevent another’s progress. A final Accord 
operation commits these write intents atomically across all shards, so that 
reads that occur after this operation integrate these writes, and those that 
executed before do not.
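
As a rough sketch of write intents acting as a distributed write buffer, in
Python: IntentShard and commit() are hypothetical names, and the loop in
commit() merely stands in for the final Accord operation, which is what would
actually make the step atomic across shards. Nothing here is Accord or
Cassandra code.

```python
class IntentShard:
    """Hypothetical shard holding committed data plus per-transaction intents."""

    def __init__(self):
        self.committed = {}  # key -> value visible to every transaction
        self.intents = {}    # txn_id -> {key: value}, visible to txn_id only

    def put_intent(self, txn_id, key, value):
        self.intents.setdefault(txn_id, {})[key] = value

    def read(self, key, txn_id=None):
        # Intents answer reads only for the transaction that inserted them.
        own = self.intents.get(txn_id, {})
        if key in own:
            return own[key]
        return self.committed.get(key)

    def apply_commit(self, txn_id):
        self.committed.update(self.intents.pop(txn_id, {}))


def commit(shards, txn_id):
    # Stand-in for the final Accord operation; this loop is NOT atomic itself.
    for shard in shards:
        shard.apply_commit(txn_id)


s1, s2 = IntentShard(), IntentShard()
s1.put_intent("t1", "x", 1)
s2.put_intent("t1", "y", 2)

own_before = s1.read("x", txn_id="t1")     # owner sees its intent
other_before = s1.read("x", txn_id="t2")   # other transactions ignore it
commit([s1, s2], "t1")
other_after = (s1.read("x", txn_id="t2"), s2.read("y", txn_id="t2"))
```

No transaction ever waits on another here, which is the sense in which no
arbitration is needed for READ COMMITTED.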

Note that in this world, one-shot transactions may still execute without 
participating in the complex transaction system. In the READ COMMITTED world 
this is particularly simple: they may simply execute immediately using normal 
Accord operations. But it remains true even for SERIALIZABLE isolation, so long 
as there are no read or write intents on the keys involved. In this case we must 
validate there are no such intents, and if any are found we may need to upgrade 
to a complex transaction. This can be done atomically as part of the Accord 
operation, and then the transaction concurrency control arbitration mechanism 
kicks in.
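
The intent check for one-shot transactions could look roughly like the
following Python sketch. plan_one_shot and the intents_by_key mapping are
hypothetical; in a real system the check would run atomically as part of the
Accord operation rather than as a separate function call.

```python
def plan_one_shot(keys, intents_by_key):
    """Decide how a one-shot transaction proceeds under SERIALIZABLE.

    intents_by_key maps a key to the set of transaction ids holding read or
    write intents on it (hypothetical representation).
    """
    conflicting = [k for k in keys if intents_by_key.get(k)]
    if conflicting:
        # Intents found: upgrade so concurrency control can arbitrate.
        return ("complex", conflicting)
    return ("one-shot", [])


intents = {"a": {"t7"}, "b": set()}
clean = plan_one_shot(["b", "c"], intents)
upgraded = plan_one_shot(["a", "b"], intents)
```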
