[ 
https://issues.apache.org/jira/browse/IGNITE-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Lantukh updated IGNITE-8020:
---------------------------------
    Description: 
The existing rebalancing protocol is suitable for in-memory data storage, but 
for data persisted in files it is sub-optimal and requires many unnecessary 
steps. Efforts to optimize it showed that the protocol needs to be completely 
reworked: instead of sending batches (SupplyMessages) of cache entries, it is 
possible to send the data files directly.

The algorithm should look like this:
1. The demander node sends requests with the required partition IDs (as it 
does now).
2. The supplier node receives the request and performs a checkpoint.
3. After the checkpoint is done, the supplier sends the files of the demanded 
partitions using the low-level NIO API.
4. During steps 2-3, the demander node should work in a special mode: it 
should temporarily store all incoming updates in such a way that they can be 
quickly applied later.
5. After the files are transferred, the demander applies the updates stored at 
step 4.
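
Step 3 could be sketched roughly as follows. This is only an illustration of 
sending a file through the standard java.nio FileChannel.transferTo() call; 
the class PartitionFileSender and its send() method are hypothetical names, 
not existing Ignite code, and the target is assumed to be a blocking channel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not part of Ignite): streams one partition file to a channel. */
final class PartitionFileSender {
    private PartitionFileSender() {
        // Static helper, no instances.
    }

    /**
     * Copies the whole file to {@code target} and returns the number of bytes sent.
     * Assumes {@code target} is a blocking channel, e.g. a blocking SocketChannel.
     */
    static long send(Path partFile, WritableByteChannel target) throws IOException {
        try (FileChannel src = FileChannel.open(partFile, StandardOpenOption.READ)) {
            long size = src.size();
            long pos = 0;

            // transferTo() may move fewer bytes than requested, so call it in a loop.
            while (pos < size)
                pos += src.transferTo(pos, size - pos, target);

            return pos;
        }
    }
}
```

The file must not change while it is being sent, which is why the checkpoint 
in step 2 has to complete first.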

The tricky part here is switching the work modes of the demander node while 
avoiding all possible race conditions. In addition, the algorithm above should 
be extended to transfer or rebuild query indexes.
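
To make the mode switch of steps 4-5 concrete, here is a deliberately 
simplified sketch. DemanderPartition and all of its methods are hypothetical 
and only model the idea with an in-memory map; a single lock serializes 
updates against the switch, which a real implementation would most likely 
need to replace with something finer-grained. Removals and query indexes are 
not modeled.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical model of a demander-side partition; not Ignite code. */
final class DemanderPartition {
    /** Stands in for the partition's persistent store. */
    private final Map<String, String> store = new HashMap<>();

    /** Updates received while files are being transferred (step 4). */
    private final Queue<String[]> pending = new ArrayDeque<>();

    /** True while the partition is in the special buffering mode. */
    private boolean buffering;

    /** Enters the special mode before the demand request is sent. */
    synchronized void startFileRebalance() {
        buffering = true;
    }

    /** Applies an update directly, or buffers it while in the special mode. */
    synchronized void update(String key, String val) {
        if (buffering)
            pending.add(new String[] {key, val});
        else
            store.put(key, val);
    }

    /** Installs the transferred data, replays buffered updates in order, leaves the mode (step 5). */
    synchronized void finishFileRebalance(Map<String, String> transferred) {
        store.putAll(transferred);

        for (String[] u; (u = pending.poll()) != null; )
            store.put(u[0], u[1]);

        // The flag is flipped under the same lock, so no update can slip in between.
        buffering = false;
    }

    synchronized String get(String key) {
        return store.get(key);
    }
}
```

Because update() and finishFileRebalance() share one lock, an update is either 
buffered and replayed or applied directly after the switch; it cannot be lost 
or applied before the transferred data is in place.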

> Rebalancing for persistent caches should transfer file store over network 
> instead of using existing supply/demand protocol
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-8020
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8020
>             Project: Ignite
>          Issue Type: Improvement
>          Components: persistence
>            Reporter: Ilya Lantukh
>            Assignee: Ilya Lantukh
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
