[
https://issues.apache.org/jira/browse/IGNITE-9592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Ozerov closed IGNITE-9592.
-----------------------------------
> MVCC: Use linked lists to store multiple versions.
> --------------------------------------------------
>
> Key: IGNITE-9592
> URL: https://issues.apache.org/jira/browse/IGNITE-9592
> Project: Ignite
> Issue Type: Improvement
> Components: mvcc
> Reporter: Roman Kondakov
> Priority: Major
> Labels: performance
>
> Currently we store all versions of each row in primary index. It is not very
> efficient for several reasons:
> * We have to insert and remove tree item each time a new version is created
> or an old version deleted. This leads to a considerable tree operations
> number increasing as well as extra tree splits and merges operations.
> * Also this leads to a contention on leaf page write lock - each update
> operation has to obtain exclusive access to insert a new version or remove
> old version of row. During this update no body on that leaf can not be able
> to update or even read data of neighbour keys.
> * Primary key tree consumes more space if it stores each version.
> * Other vendors do not store each version in primary indexes (as far as I
> know).
> Possible solution - store only key and link to the newest version in the
> primary index. Instead of this {{CacheDataTree}} item
> {{| key part | | |}}
> {{|-----------------------------| lockVer | link |}}
> {{| cache_id | hash | mvccVer | | |}}
> we'll have:
> {{| key part | | link to the |}}
> {{|-----------------| lockVer | newest |}}
> {{| cache_id | hash | | version |}}
> Note, we do not have mvccVer in tree item. Link to the newer version leads
> to the most recent inserted data item. To find older versions, each DatRow is
> provided with "prevLink" - the link to previous version of row. DataRow
> layout can be changed from
> |header|xid_max|xid_min |KV bytes|
> to the next one:
> |header|xid_max|xid_min|*PREV_LINK*|KV bytes|
> Where *PREV_LINK* field points to the previous version of the row. Doing
> traverse overt this prevRow links we can iterate over all available versions
> without using and affecting the primary key tree.
> When the new version is created we just insert the new row in datastore, then
> do CAS on the link to the newest version in primary key tree in order it
> points to the new row. PrevLink in the new row should point on the previous
> one. That is all. We've just inserted a new version just with a long field
> CAS in the CacheDataTree. Without obtaining write lock and other headache.
> Secondary indexes are handled in the same manner as before.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)