[
https://forge.continuent.org/jira/browse/SEQUOIA-1034?page=comments#action_14552
]
Robert Hodges commented on SEQUOIA-1034:
----------------------------------------
This comment contains a suggested algorithm to implement fully deterministic
execution of SQL requests while avoiding problems with distributed deadlock
described above.
GOAL
The goal of this algorithm is to ensure deterministic execution of a stream of
SQL requests in the presence of transactions. We operate on the following
assumptions:
1.) Applications that generate SQL are free from deadlock when they run
directly against the database.
2.) Requests are delivered to this algorithm in a deterministic total order.
3.) Database locks are granted deterministically. That is to say, if
transactions A, B, and C lock a particular object in the database in that
order, the database will grant the locks in a deterministic order that is
invariant across database instances.
(I confirmed #3 on PostgreSQL (email to Tom Lane) as well as on MySQL/InnoDB
(Seppo Jaakola). This condition does not hold for databases like Sybase that
use page locking, in which case locks can vary arbitrary based on physical page
locations of row data and indexes.
PROBLEMS TO AVOID
The algorithm needs to avoid the following problems:
1.) Middleware deadlocks. Locking in middleware should be deadlock free.
2.) Distributed deadlocks. These are described below as well as elsewhere in
this JIRA.
3.) Race conditions. Race conditions for SQL execution can result in subtle
issues with non-determinism.
In general, preventing non-determinism is most difficult with statements that
involve access to "non-transactional" resources. Here are two common examples
where race conditions cause serious problems.
1.) Sequences. Sequence numbers are assigned in the order that clients ask for
them. This means that sequences are vulnerable to any re-ordering of
statements. Here's a sample that is vulnerable to a couple of different race
conditions:
UPDATE foo SET bar=nextval('my_sequence') WHERE id=25
2.) Sub-selects. Sub-selects on INSERT, UPDATE, and DELETE statements typically
get a new snapshot of data each time they execute (READ COMMITTED behavior).
This means that the results of such statements are vulnerable to the order of
COMMIT statements.
ALGORITHM
So now here is the algorithm. Assume a queue of totally ordered SQL requests.
Each SQL request is read from the queue and acquires locks on all tables
potentially affected by the SQL statement contained in the request. Locks are
acquired in a fixed order (by sorting) to ensure there are no internal
deadlocks. Once each SQL statement acquires needed locks it may be scheduled
to execute.
The SQL statement releases its locks under either of the following conditions.
1.) The SQL request completes execution
2.) The SQL request is blocked and awaiting locks in the database. To
determine this we must execute a procedure to seek blocked requests. Most
databases provide such a mechanism. Note that timeouts are insufficient to
determine if SQL is blocked since timeouts do not distinguish between blocked
SQL statements from those that are merely slow, for example a large update.
The locks required differs by SQL statements. Standard types of locks are
shown below.
1.) BEGIN. No locks required. This statement may execute immediately.
2.) SELECT FOR UPDATE. Requires a lock on the selected table.
3.) INSERT. Requires a lock on the table into which SQL is inserted.
4.) UPDATE. Requires a lock on the table being update as well as any table on
which the SQL statement may acquire shared locks (e.g., due to foreign keys) as
well as any table in a sub-select.
5.) DELETE. Same rules as for UPDATE.
6.) COMMIT. Requires locks on all tables touched by the transaction, i.e., any
table previously locked by the transaction.
Here are three examples of typical SQL scenarios and how they work with this
algorithm.
SCENARIO 1: Non-transactional execution
In this scenario, a set of auto-commit transactions updates a table A, row 1
and table B, row 1. S1,2,3 denotes different client sessions.
S1: UPDATE(A1)
S2: UPDATE(A1)
S1: UPDATE(A1)
S2: UPDATE(B1)
Updates on A proceed in total order. The update on B may proceed in parallel.
SCENARIO 2: Distributed deadlock
T1, T2: BEGIN
T1: UPDATE(A1) <-- Acquires DB row lock
T2: UPDATE(A1) <-- Acquires middleware table lock, blocks in database
awaiting row lock
T1: COMMIT
T2: COMMIT
The lock acquired by T2 is freed once the update blocks in the database. If
you add more requests contending for the same row the freeing operation can
result in reordering of requests as shown below:
T1, T2: BEGIN
T1: UPDATE(A1) <-- Acquires DB row lock
T2: UPDATE(A1) <-- Acquires middleware table lock, blocks in database
awaiting row lock
T1: UPDATE(A2)
T1: COMMIT
T2: COMMIT
This is reordered to the following sequence. The reordering, however, is
deterministic.
T1, T2: BEGIN
T1: UPDATE(A1) <-- Acquires DB row lock
T1: UPDATE(A2)
T1: COMMIT
T2: UPDATE(A1) <-- Acquires middleware table lock, blocks in database
awaiting row lock
T2: COMMIT
SCENARIO 3: Distributed deadlock plus commit race condition
In cases where transactions are blocked in the database, race conditions may
arise due to the order of commits, which release locks and permit blocked
requests to resume.
T1,T2,T3: BEGIN
T1: UPDATE(A1) <-- Grabs row level lock
T2: UPDATE(A1) <-- Blocks on row level
T3: UPDATE(A2) <-- Race starts here if T1: COMMIT runs concurrently
T1: COMMIT
T2, T3: COMMIT
The proposed algorithm avoids COMMIT race conditions by obliging the COMMIT to
acquire locks on all tables acquired during the transaction. This will order
execution after T3: UPDATE(A2).
PROBLEMS
1. Race conditions are still possible if there are multiple statements
contending for locks on the same rows. You can get a case where some
statements are being ordered by the database whereas others further up the
total order queue are still ordered by middleware locks. This leads to a
situation similar to the race condition between contents of the conflicting and
non-conflicting queues. It requires more investigation as this appears to
result in non-deterministic reordering of statements.
2. Breaking middleware locks. Detected locked requests in MySQL/InnoDb and
PostgreSQL is possible but not fast. Detecting a blocked transaction may not
be particularly quick. For example, if the controller host has a lot of load
on the CPU tests to detect locks in the database may not run in a timely
fashion, which will result in pauses in processing. This problem can be
mitigated by avoiding timer-based algorithms to look for locking problems.
> Possible race condition leading to concurrent requests being executed in a
> different order on backends
> ------------------------------------------------------------------------------------------------------
>
> Key: SEQUOIA-1034
> URL: https://forge.continuent.org/jira/browse/SEQUOIA-1034
> Project: Sequoia
> Type: Bug
> Components: Core
> Versions: Sequoia 2.10.9
> Reporter: Stephane Giron
> Assignee: Stephane Giron
> Priority: Critical
> Fix For: sequoia 2.10.11
>
>
> This can lead to backends inconsistencies.
> As an example, this is the behavior that was observed :
> CREATE TABLE `table1` (
> `db_id` bigint(20) NOT NULL auto_increment,
> `col1` varchar(255) default NULL,
> `creation_date` datetime default NULL,
> `request_date` varchar(255) default NULL,
> `col2` varchar(255) default NULL,
> `version` varchar(255) default NULL,
> PRIMARY KEY (`db_id`),
> UNIQUE KEY `col1` (`col1`,`col2`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> Transaction 1
> begin
> insert into table1 (version) values('a1');
> insert into table1 (version) values('c1');
> commit
> Transaction 2
> begin
> insert into table1 (version) values('b')
> commit
> T1 begin
> T2 begin
> T1 insert 1
> T2 insert 1
> T1 insert 2
> commits
> The following result is what I got in the backends :
> On one controller :
> | 8 | NULL | NULL | NULL | NULL | a1 |
> | 9 | NULL | NULL | NULL | NULL | c1 |
> | 10 | NULL | NULL | NULL | NULL | b |
> On the other :
> | 8 | NULL | NULL | NULL | NULL | a1 |
> | 9 | NULL | NULL | NULL | NULL | b |
> | 10 | NULL | NULL | NULL | NULL | c1 |
> This can happen if the statement "T2 insert 1", which will be posted in the
> conflicting queue, can be processed on one controller (because there is
> nothing else at this time in the non conflicting queue), whereas on the other
> controller the "T1 insert 2" statement arrives in the non conflicting queue
> before "T2 insert 1" starts executing.
> This behavior can be observed essentially because of the auto incremented
> key. However other type of statements may run out of order as well.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
https://forge.continuent.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia