[jira] [Updated] (IGNITE-17859) Update indexes on data modifications

Alexander Lapin (Jira) Mon, 10 Oct 2022 02:19:03 -0700


     [ 
https://issues.apache.org/jira/browse/IGNITE-17859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Alexander Lapin updated IGNITE-17859:
-------------------------------------
    Description: 
h3. Motivation

For better performance and as a matter of common sense, indexes need to be 
integrated into the data manipulation flow, which implies:
 * Integrating indexes into the efficient resolution of keys to rowIds, both 
within pure read/scan operations and as part of modification operations, in 
order to find the rowId of the row to be modified.
 * Integrating indexes into data modification itself: not only must the data 
be updated, but all corresponding indexes must also be populated with the 
updated entries, with cleanup on transaction rollback.

This Jira issue covers the second part.

*Definition of Done*
 * All indexes with the relevant schema version are populated with modified rows.
 * All pending index entries are removed on tx rollback.

h3. Implementation Notes
 # It seems to make sense to introduce new Index abstractions that manage 
update logic internally, something like the following:
 ** Index
 ** HashIndex extends Index
 ** HashUniqueIndex extends HashIndex
 ** SortedIndex extends Index
 ** SortedUniqueIndex extends SortedIndex
 # To determine which indexes to update, both during the update itself and 
during rollback, it will be useful to add an extra parameter _schemaVersion_ 
to each operation enlisted into a transaction. This allows selecting only the 
set of indexes with relevant schema versions.
 # During transaction rollback, it will be necessary to clean up outdated 
pending entries that were touched during the tx lifetime.
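
As a rough illustration of items 1 and 2, the sketch below models the proposed 
hierarchy and a schema-version filter. Everything in it is hypothetical: the 
constructor arguments, the unique() method, the selectForOperation helper, and 
the assumption that a "relevant" index is one whose schema version equals the 
operation's schemaVersion are illustrative only and not taken from the Ignite 
code base.

```java
import java.util.ArrayList;
import java.util.List;

class IndexSelectionSketch {
    /** Hypothetical base abstraction: every index is bound to one schema version. */
    abstract static class Index {
        final String name;
        final int schemaVersion;

        Index(String name, int schemaVersion) {
            this.name = name;
            this.schemaVersion = schemaVersion;
        }

        /** Whether the index enforces uniqueness (assumed hook, not from the ticket). */
        boolean unique() {
            return false;
        }
    }

    static class HashIndex extends Index {
        HashIndex(String name, int schemaVersion) { super(name, schemaVersion); }
    }

    static class HashUniqueIndex extends HashIndex {
        HashUniqueIndex(String name, int schemaVersion) { super(name, schemaVersion); }
        @Override boolean unique() { return true; }
    }

    static class SortedIndex extends Index {
        SortedIndex(String name, int schemaVersion) { super(name, schemaVersion); }
    }

    static class SortedUniqueIndex extends SortedIndex {
        SortedUniqueIndex(String name, int schemaVersion) { super(name, schemaVersion); }
        @Override boolean unique() { return true; }
    }

    /** Keeps only the indexes whose schema version equals the operation's version. */
    static List<Index> selectForOperation(List<Index> all, int schemaVersion) {
        List<Index> res = new ArrayList<>();
        for (Index idx : all) {
            if (idx.schemaVersion == schemaVersion) {
                res.add(idx);
            }
        }
        return res;
    }

    public static void main(String[] args) {
        List<Index> all = List.of(
                new HashIndex("by_name", 1),
                new SortedUniqueIndex("by_id", 2),
                new HashUniqueIndex("by_email", 2));

        // An operation enlisted with schemaVersion = 2 only touches the last two indexes.
        for (Index idx : selectForOperation(all, 2)) {
            System.out.println(idx.name + " unique=" + idx.unique());
        }
    }
}
```

The same filter would apply on rollback, so that only indexes of the schema 
version the operation was enlisted with are cleaned up.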

h4. More details about the *first* item.

The Index itself may declare update() and other methods with 
index-type-specific lock management and uniqueness processing logic. For 
example, the following is suggested for HashIndex.update:
{code:java}
@Override
public CompletableFuture<Void> update(
        UUID txId, TxState txState, Tuple oldRow, Tuple newRow, VersionChain<Tuple> rowId) {
    // Project both row versions onto the indexed column; tombstones stay tombstones.
    Tuple oldVal = oldRow == Tuple.TOMBSTONE ? Tuple.TOMBSTONE : oldRow.select(col);
    Tuple newVal = newRow == Tuple.TOMBSTONE ? Tuple.TOMBSTONE : newRow.select(col);

    List<CompletableFuture<?>> futs = new ArrayList<>();

    // The index is only touched when the indexed value actually changes.
    if (!oldVal.equals(newVal)) {
        if (oldVal.length() > 0) {
            // Lock the old key, but keep its index entry (see the comment below).
            Lock lock0 = lockTable.getOrAddEntry(oldVal);

            txState.addLock(lock0);

            futs.add(lock0.acquire(txId, LockMode.IX));

            // Do not remove bookmarks due to multi-versioning.
        }

        if (newVal.length() > 0) {
            Lock lock0 = lockTable.getOrAddEntry(newVal);

            txState.addLock(lock0);

            // Insert the new entry once the lock is granted and register an undo
            // action so that the entry is removed on rollback.
            futs.add(lock0.acquire(txId, LockMode.IX).thenAccept(ignored0 -> {
                if (index.insert(newVal, rowId)) {
                    txState.addUndo(() -> index.remove(newVal, rowId));
                }
            }));
        }
    }

    // Completes when all lock acquisitions and index inserts have completed.
    return CompletableFuture.allOf(futs.toArray(new CompletableFuture[0]));
}
{code}
Further details can be found in [https://github.com/ascherbakoff/ai3-txn-mvp]

The detailed lock management design is described in 
[IEP-91-Locking model|https://cwiki.apache.org/confluence/display/IGNITE/IEP-91%3A+Transaction+protocol#IEP91:Transactionprotocol-Lockingmodel]. 
See the index-related sections; e.g. the following lock flow is suggested for 
HashIndex:
{code:java}
Non-unique locks
  // scan
  S_commit(key)

  // insert
  IX_commit(key)

  // delete
  IX_commit(key) {code}
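To make the effect of this flow concrete, here is a minimal sketch of a per-key 
lock table restricted to the two modes listed above. It assumes the classic 
multi-granularity compatibility rules (S is compatible with S, IX with IX, and 
S conflicts with IX); the class, the String-based keys and transaction ids, and 
the non-blocking tryAcquire are hypothetical simplifications, and lock upgrades 
and waiting are not modelled.

```java
import java.util.HashMap;
import java.util.Map;

class HashIndexLockSketch {
    enum LockMode { S, IX }

    /** Index key -> (transaction id -> mode currently held on that key). */
    private final Map<String, Map<String, LockMode>> held = new HashMap<>();

    /** With only S and IX in play, two modes are compatible exactly when they are equal. */
    static boolean compatible(LockMode a, LockMode b) {
        return a == b;
    }

    /** Grants the lock if no other transaction holds a conflicting mode on the key. */
    boolean tryAcquire(String txId, String key, LockMode mode) {
        Map<String, LockMode> holders = held.computeIfAbsent(key, k -> new HashMap<>());

        for (Map.Entry<String, LockMode> e : holders.entrySet()) {
            if (!e.getKey().equals(txId) && !compatible(e.getValue(), mode)) {
                return false;
            }
        }

        holders.put(txId, mode);
        return true;
    }

    /** Releases every lock held by the transaction, e.g. on commit or rollback. */
    void releaseAll(String txId) {
        for (Map<String, LockMode> holders : held.values()) {
            holders.remove(txId);
        }
    }

    public static void main(String[] args) {
        HashIndexLockSketch locks = new HashIndexLockSketch();

        System.out.println(locks.tryAcquire("tx1", "k", LockMode.S));   // scan
        System.out.println(locks.tryAcquire("tx2", "k", LockMode.IX));  // insert while scan lock is held
        locks.releaseAll("tx1");
        System.out.println(locks.tryAcquire("tx2", "k", LockMode.IX));  // insert after release
        System.out.println(locks.tryAcquire("tx3", "k", LockMode.IX));  // concurrent insert on same key
    }
}
```

Under these assumptions, concurrent inserts and deletes on the same non-unique 
key do not block each other, while a scan of that key conflicts with both.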
h4. More details about the *third* item.

Please see the cleanup flow described in 
https://issues.apache.org/jira/browse/IGNITE-17673
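
Independently of that ticket, the undo-based cleanup implied by 
txState.addUndo in the HashIndex.update suggestion can be sketched as follows. 
This is only an illustration under stated assumptions: the TxState and HashIndex 
classes here are simplified stand-ins with String keys and long rowIds, the 
insertInTx helper is hypothetical, and the actual cleanup flow of IGNITE-17673 
may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RollbackCleanupSketch {
    /** Simplified transaction state that only tracks undo actions. */
    static class TxState {
        private final Deque<Runnable> undo = new ArrayDeque<>();

        void addUndo(Runnable action) {
            undo.push(action);
        }

        /** Runs the undo actions in reverse registration order. */
        void rollback() {
            while (!undo.isEmpty()) {
                undo.pop().run();
            }
        }
    }

    /** Simplified non-unique hash index: key -> set of rowIds. */
    static class HashIndex {
        private final Map<String, Set<Long>> entries = new HashMap<>();

        /** Returns true only if the entry was not present before. */
        boolean insert(String key, long rowId) {
            return entries.computeIfAbsent(key, k -> new HashSet<>()).add(Long.valueOf(rowId));
        }

        boolean remove(String key, long rowId) {
            Set<Long> ids = entries.get(key);
            if (ids == null || !ids.remove(Long.valueOf(rowId))) {
                return false;
            }
            if (ids.isEmpty()) {
                entries.remove(key);
            }
            return true;
        }

        boolean contains(String key, long rowId) {
            Set<Long> ids = entries.get(key);
            return ids != null && ids.contains(Long.valueOf(rowId));
        }
    }

    /** Inserts a pending entry and registers its removal in case of rollback. */
    static void insertInTx(TxState tx, HashIndex index, String key, long rowId) {
        if (index.insert(key, rowId)) {
            tx.addUndo(() -> index.remove(key, rowId));
        }
    }

    public static void main(String[] args) {
        HashIndex index = new HashIndex();
        TxState tx = new TxState();

        insertInTx(tx, index, "alice", 1L);
        System.out.println("before rollback: " + index.contains("alice", 1L));

        tx.rollback();
        System.out.println("after rollback: " + index.contains("alice", 1L));
    }
}
```

Registering the undo action only when insert reports a new entry keeps rollback 
from removing entries that existed before the transaction touched them.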

 

It might make sense to handle all three items as separate tickets.

> Update indexes on data modifications
> ------------------------------------
>
>                 Key: IGNITE-17859
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17859
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
