[jira] [Updated] (IGNITE-17673) Extend MV partition storage API with methods to help cleaning up SQL indices

Ivan Bessonov (Jira) Tue, 13 Sep 2022 07:30:21 -0700


     [ https://issues.apache.org/jira/browse/IGNITE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ivan Bessonov updated IGNITE-17673:
-----------------------------------
    Description: 
h3. TL;DR

The following method should be added to MvPartitionStorage:
{code:java}
Cursor<BinaryRow> scanVersions(RowId rowId); {code}
h3. Details

To allow indices to be cleaned up, partition storage needs an extra API method.

In pseudo-code, the cleanup should look like the following:
{code:java}
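// oldRow is the row version that newRow replaces, if there was one.
BinaryRow oldRow = partition.addWrite(rowId, txId, partitionId, newRow);

if (oldRow != null) {
    // Indexes whose entry for oldRow may have become garbage.
    Set<Index> allIndexes = getAllIndexes();

    for (BinaryRow version : partition.scanVersions(rowId)) {
        // Keep the entry in every index where a remaining version still maps to the same key.
        // removeIf avoids the ConcurrentModificationException that calling
        // Set.remove inside a for-each loop over the same set would throw.
        allIndexes.removeIf(index -> index.rowsMatch(oldRow, version));

        if (allIndexes.isEmpty()) {
            break;
        }
    }

    // No remaining version references these index entries, so they can be removed.
    for (Index index : allIndexes) {
        index.remove(oldRow);
    }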
}{code}
This needs a little explanation.

First of all, the real implementation will probably look a bit different: the 
cursor has to be closed, and oldRow must be converted to a binary tuple. The 
row-matching algorithm shouldn't live in the index itself, because it depends 
on versioned row schemas, and indexes don't know about them. Keeping a set and 
removing from it doesn't look optimal either. This is just a sketch.
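
To make these caveats concrete, here is a minimal, self-contained sketch of the same cleanup. It is not the Ignite implementation: Cursor, ListCursor, Index, KeyExtractor and cleanup are hypothetical stand-ins, rows are plain String arrays, and index keys are plain strings. It only illustrates closing the cursor with try-with-resources, keeping the schema-dependent matching outside the index, and working on a copy of the index list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Sketch only: every type here is a hypothetical stand-in, not the Ignite API. */
public class IndexCleanupSketch {
    /** Stand-in for the storage cursor: an iterator that must be closed. */
    public interface Cursor<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close();
    }

    /** Cursor over an in-memory list that remembers whether it was closed. */
    public static final class ListCursor<T> implements Cursor<T> {
        private final Iterator<T> it;
        public boolean closed;

        public ListCursor(List<T> rows) {
            this.it = rows.iterator();
        }

        @Override public boolean hasNext() { return it.hasNext(); }
        @Override public T next() { return it.next(); }
        @Override public void close() { closed = true; }
    }

    /** Stand-in for an index: it only stores keys and knows nothing about row schemas. */
    public static final class Index {
        public final String name;
        public final List<String> removedKeys = new ArrayList<>();

        public Index(String name) {
            this.name = name;
        }

        public void remove(String key) {
            removedKeys.add(key);
        }
    }

    /** Schema-aware conversion of a row to the key of one index; lives outside Index. */
    public interface KeyExtractor {
        String keyOf(Index index, String[] row);
    }

    /** Removes oldRow's key from every index where no remaining version has the same key. */
    public static List<Index> cleanup(
            String[] oldRow, Cursor<String[]> versions, List<Index> allIndexes, KeyExtractor extractor) {
        // Work on a copy so the caller's collection is never mutated.
        List<Index> unreferenced = new ArrayList<>(allIndexes);

        // try-with-resources closes the cursor even if matching throws.
        try (Cursor<String[]> cursor = versions) {
            while (!unreferenced.isEmpty() && cursor.hasNext()) {
                String[] version = cursor.next();
                // An index stays untouched if a remaining version maps to the same key.
                unreferenced.removeIf(
                        index -> extractor.keyOf(index, oldRow).equals(extractor.keyOf(index, version)));
            }
        }

        for (Index index : unreferenced) {
            index.remove(extractor.keyOf(index, oldRow));
        }
        return unreferenced;
    }

    public static void main(String[] args) {
        Index byName = new Index("byName");
        Index byCity = new Index("byCity");
        KeyExtractor extractor = (index, row) -> index == byName ? row[1] : row[2];

        List<String[]> remaining = new ArrayList<>();
        remaining.add(new String[] {"1", "alice", "LA"});

        List<Index> indexes = new ArrayList<>();
        indexes.add(byName);
        indexes.add(byCity);

        String[] oldRow = {"1", "alice", "NY"};
        for (Index index : cleanup(oldRow, new ListCursor<>(remaining), indexes, extractor)) {
            System.out.println(index.name + " removed " + index.removedKeys);
        }
    }
}
```

In this sketch the old row and the remaining version share the name but not the city, so only the city index loses its entry for the old row.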

Second, as far as the API for getting the versions of a single key is 
concerned, the sketch is pretty close to what I have in mind:
{code:java}
Cursor<BinaryRow> scanVersions(RowId rowId);{code}
Versions should be returned from newest to oldest. The timestamps themselves 
don't seem to be necessary.
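
The ordering contract can be illustrated with a toy in-memory version chain. This is only a sketch under simplifying assumptions: VersionChainSketch, addVersion and readAll are hypothetical names, rows are strings, row ids are longs, and transactions and timestamps are ignored entirely. Only the scanVersions name and the newest-to-oldest order come from the proposal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy storage, not the Ignite API: keeps one version chain per row id. */
public class VersionChainSketch {
    /** Stand-in for the storage cursor: an iterator that must be closed. */
    public interface Cursor<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close();
    }

    // The head of each deque is the newest version of the row.
    private final Map<Long, Deque<String>> chains = new HashMap<>();

    /** Hypothetical helper: puts a new version at the head of the row's chain. */
    public void addVersion(long rowId, String row) {
        chains.computeIfAbsent(rowId, id -> new ArrayDeque<>()).addFirst(row);
    }

    /** Returns the versions of one row from newest to oldest, without timestamps. */
    public Cursor<String> scanVersions(long rowId) {
        Deque<String> chain = chains.getOrDefault(rowId, new ArrayDeque<>());
        // Iterate over a snapshot so later writes do not disturb an open cursor.
        Iterator<String> snapshot = new ArrayList<>(chain).iterator();

        return new Cursor<String>() {
            @Override public boolean hasNext() { return snapshot.hasNext(); }
            @Override public String next() { return snapshot.next(); }
            @Override public void close() { /* nothing to release in this in-memory sketch */ }
        };
    }

    /** Drains a cursor into a list and closes it via try-with-resources. */
    public List<String> readAll(long rowId) {
        List<String> result = new ArrayList<>();
        try (Cursor<String> cursor = scanVersions(rowId)) {
            while (cursor.hasNext()) {
                result.add(cursor.next());
            }
        }
        return result;
    }
}
```

Since the newest version comes first, a caller such as the index cleanup can stop reading as soon as it has found what it needs.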

 


> Extend MV partition storage API with methods to help cleaning up SQL indices
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-17673
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17673
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Bessonov
>            Priority: Major
>              Labels: ignite-3
>


