Ivan Bessonov created IGNITE-18539:
--------------------------------------

             Summary: Implement build procedure for new indexes
                 Key: IGNITE-18539
                 URL: https://issues.apache.org/jira/browse/IGNITE-18539
             Project: Ignite
          Issue Type: Improvement
            Reporter: Ivan Bessonov


First of all, "build" is the process of moving the index from the "WO" 
(write-only) to the "RW" (read-write) state in schema definitions. A finished 
build creates a new schema version. This means that new transactions, started 
after a specific timestamp, will be able to use this index in their plans.

Now, when an index is created, we also create a table somewhere that looks 
similar to this:

 
{code:java}
indexCompletion.<indexId>.0 = false
...
indexCompletion.<indexId>.N = false {code}
It is used to track completion of the index build for each individual 
partition. Once the index is built for every partition, the WO -> RW state 
transition may occur.
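
The per-partition flags and the "all partitions done" condition can be sketched as follows. This is only an illustration: the class and method names are hypothetical, not existing Ignite API, and a real implementation would keep this state in replicated storage rather than in memory.

```java
import java.util.BitSet;

/** Hypothetical in-memory model of the per-partition completion flags for one index. */
final class IndexBuildProgress {
    private final int partitions;
    private final BitSet completed;

    IndexBuildProgress(int partitions) {
        this.partitions = partitions;
        // Every bit starts as false, like the initial table contents.
        this.completed = new BitSet(partitions);
    }

    /** Records that the build has finished for the given partition. */
    synchronized void markCompleted(int partitionId) {
        if (partitionId < 0 || partitionId >= partitions) {
            throw new IllegalArgumentException("Unknown partition: " + partitionId);
        }
        completed.set(partitionId);
    }

    /** The WO -> RW transition is allowed only after every partition has reported completion. */
    synchronized boolean readyForReadWrite() {
        return completed.cardinality() == partitions;
    }
}
```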

 

For each individual partition, the leader/coordinator/leaseholder/whatever node 
starts iterating over all rowIds in the partition and replicates the "index 
build command" for each row (or in batches). This procedure can be described 
with this pseudocode:
{code:java}
RowId curRowId = minRowId();

// Get the smallest existing row id greater than or equal to the parameter.
curRowId = partition.findRowId(curRowId);

while (curRowId != null) {
    // Put that into a state machine.
    replicateIndexRowRebuild(indexId, curRowId);

    // Compute the next row id candidate.
    curRowId = increment(curRowId);

    // Get the real next row id value.
    if (curRowId != null) {
        curRowId = partition.findRowId(curRowId);
    }
}

indexBuildCompleted(indexId, partitionId);{code}
Such replication doesn't require too much traffic, and it guarantees 
consistency between replicas, which is crucial.
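
To make the iteration order concrete, here is a small runnable model of the loop above. It assumes, purely for illustration, that row ids are plain long values and that a partition is a sorted in-memory set. The class name and the buildOrder method are hypothetical, and adding to a list stands in for replicating the build command.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Toy model of the build loop; row ids are simplified to long values. */
final class IndexBuildLoopSketch {
    /** Returns the smallest existing row id >= candidate, or null if there is none. */
    static Long findRowId(NavigableSet<Long> partition, long candidate) {
        return partition.ceiling(candidate);
    }

    /** Returns the next candidate row id, or null if the id space is exhausted. */
    static Long increment(long rowId) {
        return rowId == Long.MAX_VALUE ? null : rowId + 1;
    }

    /** Returns the row ids in the order in which build commands would be issued. */
    static List<Long> buildOrder(NavigableSet<Long> partition) {
        List<Long> replicated = new ArrayList<>();
        Long cur = findRowId(partition, Long.MIN_VALUE);

        while (cur != null) {
            // Stands in for replicateIndexRowRebuild(indexId, curRowId).
            replicated.add(cur);

            Long next = increment(cur);
            cur = next == null ? null : findRowId(partition, next);
        }

        return replicated;
    }
}
```

Each existing row is visited exactly once, in ascending order, and the loop terminates even when the largest possible id is present.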

Once the index is created and in the WO state, all insertions should update it 
"unconditionally", independently of the build process. This gives us 
cooperation and simplifies the way we guarantee consistency between the row 
store and the index.

Easy optimization: during regular load, we can avoid updating WO indexes if 
rowId > lastRowId for that particular index, because it will be updated later 
anyway.
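
The check behind this optimization could look like the sketch below. It assumes row ids can be compared as long values and that the last row id processed by the build is available to the write path. The class and method names are hypothetical.

```java
/** Hypothetical write-path rule for an index that is still being built. */
final class WriteOnlyIndexUpdateRule {
    /**
     * Returns true if a regular write must update the WO index itself.
     * Rows beyond the build position are skipped, because the build
     * procedure will reach them later.
     */
    static boolean mustUpdateIndex(long rowId, long lastRowId) {
        return rowId <= lastRowId;
    }
}
```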

 

About that: we should store "lastUpdatedRowId" for every index that has not 
yet completed its build. We need it for two reasons:
 * the leader might change
 * the cluster may be restarted

In both cases we simply resume from where we stopped last time, instead of 
starting from scratch, as was the case in Ignite 2.x.
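
Resuming from the stored position can be sketched like this. The example assumes long row ids and an in-memory sorted set as the partition. The field lastUpdatedRowId stands in for the durably stored value, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Toy model of an index build that can be interrupted and resumed. */
final class ResumableIndexBuild {
    private final NavigableSet<Long> partition;

    /** Stands in for the persisted lastUpdatedRowId; null means nothing has been built yet. */
    private Long lastUpdatedRowId;

    /** Row ids for which a build command was issued, in order. */
    final List<Long> indexed = new ArrayList<>();

    ResumableIndexBuild(NavigableSet<Long> partition) {
        this.partition = partition;
    }

    /** Processes at most maxRows rows; returns true once the partition is fully built. */
    boolean run(int maxRows) {
        // Start from the beginning only if no progress has been recorded.
        Long cur = lastUpdatedRowId == null
                ? partition.ceiling(Long.MIN_VALUE)
                : partition.higher(lastUpdatedRowId);

        for (int i = 0; i < maxRows && cur != null; i++) {
            indexed.add(cur);
            lastUpdatedRowId = cur;
            cur = partition.higher(cur);
        }

        return cur == null;
    }
}
```

Calling run again after an interruption (a leader change or a restart in the real system) continues after the last recorded row id, so no row is processed twice.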



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
