[ 
https://issues.apache.org/jira/browse/IGNITE-28192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18070478#comment-18070478
 ] 

Anton Laletin commented on IGNITE-28192:
----------------------------------------

We implemented the following algorithm to achieve transaction serializability:
https://www.cs.utexas.edu/~dsb/cs386d/Readings/ConcurrencyControl/Aries-KVL.PDF

However, the following scenario can occur.

Let’s assume the storage contains keys 101, 102, 105.

A scan reaches 102.

Two concurrent inserts add keys 103 and 104. Both take IX locks on 105. Since 
IX locks are compatible, they do not block each other and proceed in parallel.

The insert of 104 completes first and writes its value to storage.

At this moment, the scan advances and acquires a lock on 104. Meanwhile, the 
insert of 103 is still in progress. For 103, the “next key” remains 105, 
because it did not conflict with 104 (they executed concurrently without 
blocking each other).

Then 103 is written to storage and releases its lock on 105.

The scan proceeds to 105, effectively skipping 103.

A subsequent scan will, of course, observe 103.

If we used an X lock on the next key in the insert transaction, this issue 
would not occur. However, this would effectively correspond to the System R 
approach.

To correctly implement ARIES/KVL, we also need to take locks (or latches) on 
pages, specifically on the current and the next page. This is what allows 
concurrent transactions to behave correctly.

These must be short-lived locks, held only for the duration of the operation. 
This prevents unnecessary blocking while also avoiding the scenario described 
above.

It is important to note that ARIES/KVL provides higher concurrency than the 
System R approach due to the use of short-duration locks. We would prefer not 
to adopt the System R approach, as it would reduce performance.

However, we currently do not have the concept of pages in our storage, so it is 
not feasible to implement ARIES/KVL as described.

We need to decide on the appropriate direction going forward.

> Phantom reads in RW scans
> -------------------------
>
>                 Key: IGNITE-28192
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28192
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Assignee: Anton Laletin
>            Priority: Major
>              Labels: ignite-3
>         Attachments: IGNITE-28192.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{ItTableScanTest#testPhantomReads}} is flaky, with a rate of about 1% 
> locally. I'm providing no TC link because I didn't check its state. Here I 
> will describe how it happens:
>  * Initially we have following rows in an index: {{10, 40}}.
>  * Read-Write scan is started. It acquires a lock to {{10}}.
>  * We do a {{"PUT 30"}} operation, let's call it {{TX_A}}.
>  ** It successfully acquires a "next" side of a gap lock first. it's 
> currently a {{40}}.
>  ** It successfully acquires a "current" side of a gap lock, which is {{30}}. 
> The gap lock is {{[40, 30]}}
>  ** There's no data insertion yet.
>  * Another thread does a {{"PUT 20"}} operation, let's call it {{TX_B}}.
>  ** It reads the "next" side of a gap lock candidate from the index. It's 
> {{40}}.
>  * {{TX_A}} finishes.
>  ** {{30}} is inserted into the index.
>  ** Gap lock {{[40, 30]}} is released.
>  * Scan acquires a lock {{30}}, which gives a gap lock {{[10, 30]}}.
>  * {{TX_B}} acquires a lock {{40}} that it previously read.
>  ** Then it acquires a lock for its own key {{20}}, forming a {{[40, 20]}} 
> gap lock. _It does NOT re-read "next" key and doesn't reacquire a proper 
> "next" side_ (which is {{30}}, not {{40}}).
> ** Then {{TX_B}} inserts {{20}} into the index and finishes, releasing a 
> {{[40, 20]}} gap lock.
> * Scan finishes with acquiring a {{[30, 40]}} gap lock. Data that it has read 
> is {{10, 30, 40}}. And {{20}} is inserted into index concurrently.
> * Second scan of the same index will return {{10, 20, 30, 40}}, giving us a 
> phantom read of {{20}}.
> UPDATE: I don't think that {{if (!rowIdMatches(peekedRow, 
> peekedRowAfterLock))}} is correct either, we may need to compare the payload. 
> Let's check the design document and use common sense here.
> UPDATE 2: Information here is not 100% accurate. Insertions do release 
> short-term locks before commit, this is not reflected in the scenario, but it 
> should not change the outcome.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to