[ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-12069:
---------------------------------------
    Fix Version/s:     (was: 2.9)
                   2.10

> Implement file rebalancing management
> -------------------------------------
>
>                 Key: IGNITE-12069
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12069
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Maxim Muzafarov
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: iep-28
>             Fix For: 2.10
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{Preloader}} should be able to do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded;
>  # switch cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition;
>  # run async the eviction indexes for the list of collected partitions;
>  # send a request message to each node one by one with the list of partitions 
> to load;
>  # wait for files received (listening for the transmission handler);
>  # run rebuild indexes async over the receiving partitions;
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page);
> h5. Stage 1. implement "read-only" mode for cache data store. Implement data 
> store reinitialization on the updated persistence file.
> h6. Tests:
>  - Switching under load.
>  - Check re-initialization of partition on new file.
>  - Check that in read-only mode
>  ** H2 indexes are not updated
>  ** update counter is updated
>  ** cache entries eviction works fine
>  ** tx/atomic updates on this partition works fine in cluster
> h5. Stage 2. Build Map for request partitions by node, add message that will 
> be sent to the supplier. Send a demand request, handle the response, switch 
> datastore when file received.
> h6. Tests:
>  - Check partition consistency after receiving a file.
>  - File transmission under load.
>  - Failover - some of the partitions have been switched, the node has been 
> restarted, rebalancing is expected to continue only for fully loaded large 
> partitions through the historical rebalance, for the rest of partitions it 
> should restart from the beginning. 
> h5. Stage 3. Add WAL history reservation on supplier. Add historical 
> rebalance triggering (LWM (partition) - HWM (read-only)).
> h6. Tests:
>  - File rebalancing under load and without on atomic/tx caches. (check 
> existing PDS-enabled rebalancing tests).
>  - Ensure that MVCC groups use regular rebalancing.
>  - The rebalancing on the unstable topology and failures of the 
> supplier/demander nodes at different stages.
>  - (compatibility) The old nodes should use regular rebalancing.
> h5. Stage 4 Eviction and rebuild of indexes.
> h6. Tests:
>  - File rebalancing of caches with H2 indexes.
>  - Check consistency of H2 indexes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to