[
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maxim Muzafarov updated IGNITE-12069:
-------------------------------------
Fix Version/s: 2.9
> Implement file rebalancing management
> -------------------------------------
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Maxim Muzafarov
> Assignee: Pavel Pereslegin
> Priority: Major
> Labels: iep-28
> Fix For: 2.9
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {{Preloader}} should be able to do the following:
> # build the map of partitions and corresponding supplier nodes from which
> partitions will be loaded;
> # switch cache data storage to {{no-op}} and back to original (HWM must be
> fixed here for the needs of historical rebalance) under the checkpoint and
> keep the partition update counter for each partition;
> # asynchronously evict indexes for the list of collected partitions;
> # send a request message to each node one by one with the list of partitions
> to load;
> # wait for the files to be received (by listening to the transmission handler);
> # asynchronously rebuild indexes over the received partitions;
> # run historical rebalance from LWM to HWM collected above (LWM can be read
> from the received file meta page).
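The first step in the list above, grouping partitions by the supplier node they will be loaded from, can be sketched roughly as follows. This is only an illustration: the class and method names, the use of plain strings as node ids, and the "take the first candidate owner" rule are all assumptions and are not taken from the Ignite code base.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; not Ignite's actual preloader API.
final class PartitionAssignments {
    /**
     * Groups partition ids by the supplier node chosen for each partition.
     *
     * @param owners Partition id mapped to candidate supplier node ids.
     *               Assumption: the first candidate is used as the supplier.
     * @return Supplier node id mapped to the sorted set of partitions to request from it.
     */
    static Map<String, SortedSet<Integer>> bySupplier(Map<Integer, List<String>> owners) {
        Map<String, SortedSet<Integer>> res = new TreeMap<>();

        for (Map.Entry<Integer, List<String>> e : owners.entrySet()) {
            List<String> candidates = e.getValue();

            // A partition without a known owner cannot be requested; skip it.
            if (candidates == null || candidates.isEmpty())
                continue;

            res.computeIfAbsent(candidates.get(0), k -> new TreeSet<>()).add(e.getKey());
        }

        return res;
    }
}
```

The resulting map has one entry per supplier, which matches the "send a request message to each node one by one" step: each entry would become one demand request.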
> h5. Stage 1. Implement a "read-only" mode for the cache data store. Implement data
> store reinitialization on the updated persistence file.
> h6. Tests:
> - Switching under load.
> - Check re-initialization of partition on new file.
> - Check that in read-only mode:
> ** H2 indexes are not updated
> ** update counter is updated
> ** cache entry eviction works correctly
> ** tx/atomic updates on this partition work correctly in the cluster
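To make the Stage 1 expectations concrete, here is a minimal sketch of a store that can be switched to a read-only (no-op) mode in which writes are dropped but the update counter still advances. Everything in it is hypothetical: the class name, the string keys and values, and the single counter are illustrative assumptions, and the checkpoint locking that the description requires around the switch is deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not Ignite's actual cache data store.
final class SwitchableStore {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    private final AtomicLong updCntr = new AtomicLong();

    // Assumption: a real implementation would flip this flag under the checkpoint lock.
    private volatile boolean readOnly;

    /** Switches the store to the read-only (no-op) mode or back to the normal mode. */
    void readOnly(boolean readOnly) {
        this.readOnly = readOnly;
    }

    /**
     * Applies an update. In read-only mode the value is not stored,
     * but the update counter is still incremented.
     *
     * @return Update counter value assigned to this update.
     */
    long update(String key, String val) {
        long cntr = updCntr.incrementAndGet();

        if (!readOnly)
            data.put(key, val);

        return cntr;
    }

    String get(String key) {
        return data.get(key);
    }

    long updateCounter() {
        return updCntr.get();
    }
}
```

A test along the lines of "update counter is updated" would then write while the store is read-only and assert that the counter grew while the data did not change.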
> h5. Stage 2. Build the map of partitions to request from each node and add the message that will
> be sent to the supplier. Send a demand request, handle the response, and switch
> the data store when a file is received.
> h6. Tests:
> - Check partition consistency after receiving a file.
> - File transmission under load.
> - Failover: some of the partitions have been switched and the node has been
> restarted. Rebalancing is expected to continue through historical rebalance only for
> the large partitions that were fully loaded; for the rest of the partitions it
> should restart from the beginning.
> h5. Stage 3. Add WAL history reservation on supplier. Add historical
> rebalance triggering (LWM (partition) - HWM (read-only)).
> h6. Tests:
> - File rebalancing with and without load on atomic/tx caches (check the
> existing PDS-enabled rebalancing tests).
> - Ensure that MVCC groups use regular rebalancing.
> - Rebalancing on an unstable topology with failures of the
> supplier/demander nodes at different stages.
> - (compatibility) Old nodes should use regular rebalancing.
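The historical rebalance trigger from Stage 3 boils down to comparing two counters per partition: the LWM read from the received file and the HWM kept by the read-only store. A rough sketch of that check is below. The class and method names are hypothetical, and treating the gap as the half-open counter range (LWM, HWM] is an assumption made for illustration, not something the description states.

```java
// Hypothetical sketch; not Ignite's actual rebalance code.
final class HistoricalRange {
    /**
     * Number of updates that still have to be applied from WAL history.
     * Assumption: the missing updates are those with counters in (lwm, hwm].
     *
     * @param lwm Update counter read from the received partition file.
     * @param hwm Update counter kept by the read-only store.
     */
    static long missingUpdates(long lwm, long hwm) {
        if (lwm < 0 || hwm < 0)
            throw new IllegalArgumentException("Counters must be non-negative");

        // If the file is already at or past the HWM, nothing is missing.
        return Math.max(0L, hwm - lwm);
    }

    /** Historical rebalance is only needed when the received file lags behind the HWM. */
    static boolean needsHistoricalRebalance(long lwm, long hwm) {
        return missingUpdates(lwm, hwm) > 0;
    }
}
```

When the check returns true, the supplier must still hold WAL history covering that range, which is why the stage also calls for WAL history reservation on the supplier.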
> h5. Stage 4. Eviction and rebuild of indexes.
> h6. Tests:
> - File rebalancing of caches with H2 indexes.
> - Check consistency of H2 indexes.