Ivan Bessonov created IGNITE-18031:
--------------------------------------

             Summary: Implement background GC process for MV partition storages
                 Key: IGNITE-18031
                 URL: https://issues.apache.org/jira/browse/IGNITE-18031
             Project: Ignite
          Issue Type: Improvement
            Reporter: Ivan Bessonov


Please refer to the Epic for more details. Here I only describe my thoughts about
the background GC process.
h3. General thoughts

The basic algorithm is the following:

 
{code:java}
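// Position of the GC cursor within the partition; null means "start from the beginning".
RowId rowId = null;

while (!partitionStopped) {
    if (rowId == null) {
        rowId = minRowId(partId);
    }

    // Find the closest existing row, starting from the cursor.
    rowId = partition.closestRowId(rowId);

    // Collect outdated versions of that row.
    partition.gc(rowId);

    // Advance the cursor past the processed row.
    rowId = rowId.increment();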
}{code}
Here I ignore a lot of technicalities and only show the main loop. We _could_ 
implement it exactly this way, but that would be a very bad decision, for two reasons:
 * it fully occupies a single thread. That would mean one thread per 
partition, which is unacceptable
 * it constantly reads the entire partition over and over again. I don't like 
that: it's a waste of resources, and we'd rather prioritize reading the data that 
the user needs. At the very least, there should be pauses between full runs

To address both issues, the job should be split into small batches 
(as we do in many other places), and each new batch should be submitted to the 
pool only when the current one has completed. This allows multiple partitions to 
share the same pool without the possibility of starvation.

The _pause_ part should probably involve a scheduled pool. That seems 
enough for the first implementation.
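
The batching and pause ideas above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned implementation: {{GcPartition}}, {{InMemoryPartition}}, {{BatchedGcTask}}, {{batchSize}} and {{pauseMillis}} are hypothetical names, and row ids are simplified to {{long}} values. Each task processes one small batch, then resubmits itself to the shared pool; when it reaches the end of the partition, it schedules the next full run after a pause.

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an MV partition storage; row ids are plain longs here. */
interface GcPartition {
    /** Returns the smallest existing row id that is >= from, or null if there is none. */
    Long closestRowId(long from);

    /** Garbage-collects outdated versions of a single row. */
    void gc(long rowId);
}

/** Toy in-memory partition (assumption) that only counts GC calls. */
final class InMemoryPartition implements GcPartition {
    final ConcurrentSkipListSet<Long> rows = new ConcurrentSkipListSet<>();
    final AtomicInteger gcCalls = new AtomicInteger();

    @Override public Long closestRowId(long from) { return rows.ceiling(from); }

    @Override public void gc(long rowId) { gcCalls.incrementAndGet(); }
}

/** One GC task per partition; all tasks share a single scheduled pool. */
final class BatchedGcTask implements Runnable {
    private final GcPartition partition;
    private final ScheduledExecutorService pool;
    private final int batchSize;
    private final long pauseMillis;
    private volatile boolean stopped;
    // Only the currently running batch touches the cursor; submitting the next
    // batch to the executor publishes the updated value to it.
    private long cursor = Long.MIN_VALUE;

    BatchedGcTask(GcPartition partition, ScheduledExecutorService pool, int batchSize, long pauseMillis) {
        this.partition = partition;
        this.pool = pool;
        this.batchSize = batchSize;
        this.pauseMillis = pauseMillis;
    }

    void start() { pool.execute(this); }

    void stop() { stopped = true; }

    @Override public void run() {
        if (stopped) return;
        boolean endReached = false;
        for (int i = 0; i < batchSize; i++) {
            Long rowId = partition.closestRowId(cursor);
            if (rowId == null) { endReached = true; break; }
            partition.gc(rowId);
            if (rowId == Long.MAX_VALUE) { endReached = true; break; }
            cursor = rowId + 1;
        }
        if (stopped) return;
        if (endReached) {
            // Full run finished: rewind and wait before scanning the partition again.
            cursor = Long.MIN_VALUE;
            pool.schedule(this, pauseMillis, TimeUnit.MILLISECONDS);
        } else {
            // The next batch goes to the back of the queue, so other partitions get a turn.
            pool.execute(this);
        }
    }
}
```

Because a task never holds a pool thread for longer than one batch, a pool created with, say, {{Executors.newScheduledThreadPool(n)}} can serve many partitions at once, and a partition that has just completed a full run stays idle for {{pauseMillis}} instead of immediately rereading its data.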

Other ideas and approaches will be discussed in other issues.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
