Ivan Bessonov created IGNITE-20072:
--------------------------------------
Summary: Reduce waiting time in partition listener
Key: IGNITE-20072
URL: https://issues.apache.org/jira/browse/IGNITE-20072
Project: Ignite
Issue Type: Improvement
Reporter: Ivan Bessonov
h3. The problem
For every row that we update, we must take a lock on the rowId. This
guarantees consistency of data and indexes, but it has drawbacks:
* we must allocate locks, as well as a collection for locks;
* we may wait for some time while GC or another background process does its
thing.
Ideally, no locks should be used. We must investigate the issue and reduce
waiting time or eliminate it completely. The following is a short explanation
of what's going on right now, and some ideas about possible fixes.
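To make the two costs listed above concrete, here is a minimal sketch of a per-row lock table. The names {{RowLockTable}}, {{inLock}} and {{allocatedLocks}}, and the use of {{UUID}} as a row id, are assumptions made for illustration and are not taken from the Ignite code base.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical per-row lock table; not the actual Ignite implementation. */
public class RowLockTable {
    /** Cost 1: a lock object and a collection entry are allocated per row id. */
    private final ConcurrentHashMap<UUID, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Runs the action while holding the lock that belongs to the given row id. */
    public <T> T inLock(UUID rowId, Supplier<T> action) {
        ReentrantLock lock = locks.computeIfAbsent(rowId, id -> new ReentrantLock());

        // Cost 2: this call blocks while another thread (for example a background
        // process working on the same row) holds the lock.
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }

    /** Number of lock objects currently allocated (entries are never removed in this sketch). */
    public int allocatedLocks() {
        return locks.size();
    }
}
```

The sketch never removes entries from the map; a real implementation would also have to release unused locks, which adds further bookkeeping.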
h3. Full state transfer (snapshot) lock
{{OutgoingSnapshot#acquireMvLock}}
This lock is required to coordinate the partition listener and the snapshot
output process. The idea is the following:
* the "snapshot reader" iterates over the partition;
* if there's a concurrent load, we "notify" the reader that certain version
chains are updated, and that their previous state must be preserved.
Both of these steps take an exclusive lock to process data. This is not optimal.
There is (almost) a way around it: persisting the update timestamp together
with write intents. This way the snapshot will effectively be equivalent to a
"scan by timestamp" that also includes write intents.
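A rough sketch of what such a scan could look like, under simplifying assumptions: timestamps are plain {{long}} values, a version chain is a newest-first list, and every version is immutable once written. {{TimestampScanSketch}}, {{Version}} and {{visibleAt}} are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a "scan by timestamp" that also covers write intents. */
public class TimestampScanSketch {
    /** One entry of a version chain. */
    public static final class Version {
        public final String value;
        /** Commit timestamp for committed versions, persisted update timestamp for write intents. */
        public final long timestamp;
        public final boolean writeIntent;

        public Version(String value, long timestamp, boolean writeIntent) {
            this.value = value;
            this.timestamp = timestamp;
            this.writeIntent = writeIntent;
        }
    }

    /** Returns the entries of a chain (newest first) that already existed at the snapshot timestamp. */
    public static List<Version> visibleAt(List<Version> chain, long snapshotTs) {
        List<Version> result = new ArrayList<>();

        for (Version version : chain) {
            // Because write intents carry a timestamp too, they are filtered by the
            // same rule as committed versions, and no notification is needed.
            if (version.timestamp <= snapshotTs) {
                result.add(version);
            }
        }

        return result;
    }
}
```

The sketch only works because it assumes that versions are never overwritten in place.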
One part that makes this process not optimal is updating/committing current
write intents. These two operations replace the head of the version chain,
losing its previous value, making "pure" timestamp scan impossible.
Another point is the time that we hold the lock in the snapshot reader. It must
be a single lock per single entry at most. Ideally, the partition listener
should have higher priority while acquiring that lock. This is easy to do with
the row-level locks that we use for GC.
There is also a way to avoid reading the entire version chain while holding a
single lock: read it in several sessions. We should do this. The Justin
Bieber Problem is not a joke. The time that we hold any lock must be
predictable and limited by some known bounds.
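One possible shape of reading "in several sessions" is sketched below: the chain is read in batches, and the lock is released between batches so that each hold time is bounded by the batch size rather than by the chain length. {{ChunkedChainReader}}, {{readAll}} and the positional cursor are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical reader that never holds the row lock for more than one batch. */
public class ChunkedChainReader {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final List<String> chain; // Versions of a single row, newest first.
    private int lockAcquisitions;

    public ChunkedChainReader(List<String> chain) {
        this.chain = chain;
    }

    /** Reads the whole chain, taking the lock once per batch of at most batchSize versions. */
    public List<String> readAll(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }

        List<String> result = new ArrayList<>();
        int position = 0;

        while (position < chain.size()) {
            rowLock.lock();
            lockAcquisitions++;
            try {
                int end = Math.min(position + batchSize, chain.size());
                result.addAll(chain.subList(position, end));
                position = end;
            } finally {
                // Releasing here gives other threads a chance to take the lock
                // between batches.
                rowLock.unlock();
            }
        }

        return result;
    }

    /** How many times the lock was taken by this reader. */
    public int lockAcquisitions() {
        return lockAcquisitions;
    }
}
```

A positional cursor is only safe if the chain does not change between sessions. A real implementation would need a stable cursor, for example the timestamp of the last version that was read.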
One thing to consider: we should probably perform write intent resolution in
the snapshot reader. Otherwise the receiver node will have to do it manually
later, but with no explicit notification that some of the transactions are
already committed or rolled back.
h3. GC locks