[ https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksey Plekhanov updated IGNITE-12069:
---------------------------------------
    Fix Version/s:     (was: 2.9)
                   2.10

> Implement file rebalancing management
> -------------------------------------
>
>                 Key: IGNITE-12069
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12069
>             Project: Ignite
>          Issue Type: Sub-task
>            Reporter: Maxim Muzafarov
>            Assignee: Pavel Pereslegin
>            Priority: Major
>              Labels: iep-28
>             Fix For: 2.10
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{Preloader}} should be able to do the following:
> # build the map of partitions and corresponding supplier nodes from which partitions will be loaded;
> # switch cache data storage to {{no-op}} and back to the original (HWM must be fixed here for the needs of historical rebalance) under the checkpoint, and keep the partition update counter for each partition;
> # asynchronously evict indexes for the list of collected partitions;
> # send a request message to each node, one by one, with the list of partitions to load;
> # wait for the files to be received (listening for the transmission handler);
> # asynchronously rebuild indexes over the received partitions;
> # run historical rebalance from LWM to HWM collected above (LWM can be read from the received file meta page);
>
> h5. Stage 1. Implement "read-only" mode for cache data store. Implement data store reinitialization on the updated persistence file.
> h6. Tests:
> - Switching under load.
> - Check re-initialization of partition on new file.
> - Check that in read-only mode:
> ** H2 indexes are not updated
> ** update counter is updated
> ** cache entry eviction works fine
> ** tx/atomic updates on this partition work fine in cluster
>
> h5. Stage 2. Build Map for request partitions by node, add message that will be sent to the supplier. Send a demand request, handle the response, switch datastore when file received.
> h6. Tests:
> - Check partition consistency after receiving a file.
> - File transmission under load.
> - Failover - some of the partitions have been switched and the node has been restarted: rebalancing is expected to continue only for fully loaded large partitions through the historical rebalance; for the rest of the partitions it should restart from the beginning.
>
> h5. Stage 3. Add WAL history reservation on supplier. Add historical rebalance triggering (LWM (partition) - HWM (read-only)).
> h6. Tests:
> - File rebalancing with and without load on atomic/tx caches (check existing PDS-enabled rebalancing tests).
> - Ensure that MVCC groups use regular rebalancing.
> - The rebalancing on the unstable topology and failures of the supplier/demander nodes at different stages.
> - (compatibility) The old nodes should use regular rebalancing.
>
> h5. Stage 4. Eviction and rebuild of indexes.
> h6. Tests:
> - File rebalancing of caches with H2 indexes.
> - Check consistency of H2 indexes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
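
The first {{Preloader}} step in the description (building the map of partitions and their supplier nodes, so that each supplier gets one request listing its partitions) might be sketched roughly as follows. This is only an illustration under stated assumptions: the class {{SupplierAssignments}}, the method {{groupBySupplier}}, and the use of plain strings as node IDs are hypothetical and are not taken from the Ignite code base or from the patch for this issue.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Apache Ignite.
final class SupplierAssignments {
    /**
     * Inverts a "partition -> supplier node" assignment into
     * "supplier node -> partitions", so that a single request with the
     * full partition list can be sent to each supplier.
     * Node IDs are plain strings here purely as an assumption.
     */
    static Map<String, Set<Integer>> groupBySupplier(Map<Integer, String> supplierByPartition) {
        // Sorted collections keep the result deterministic for logging and tests.
        Map<String, Set<Integer>> partitionsBySupplier = new TreeMap<>();

        for (Map.Entry<Integer, String> e : supplierByPartition.entrySet()) {
            partitionsBySupplier
                .computeIfAbsent(e.getValue(), nodeId -> new TreeSet<>())
                .add(e.getKey());
        }

        return partitionsBySupplier;
    }

    public static void main(String[] args) {
        // Partitions 0 and 2 are supplied by node-A, partition 1 by node-B.
        Map<Integer, String> assignment = Map.of(0, "node-A", 1, "node-B", 2, "node-A");

        // prints {node-A=[0, 2], node-B=[1]}
        System.out.println(groupBySupplier(assignment));
    }
}
```

Grouping by supplier up front matches the "send a request message to each node one by one" step: the demander can iterate over the resulting map and issue one request per entry.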