Team,

Users have often noted the time required to rebalance a Geode cluster as an area for improvement. Currently, buckets are moved one at a time, and we propose that moving buckets in parallel could greatly improve the performance of this feature.

Previously, parallelization was implemented for adding redundant copies of buckets to restore redundancy. Moving buckets is more complicated, however, and requires a different approach than restoring redundancy, because a member could be both gaining buckets and giving away buckets at the same time. While giving away a bucket, a member still holds all of the data for that bucket until the receiving member has fully received it and it can safely be removed from the original owner. This means that unless the member has enough memory to store both the buckets it started with and all of the buckets it will receive, moving buckets in parallel could cause the member to run out of memory.

For this reason, we propose a system that performs (potentially) several rounds of concurrent bucket moves:

1) Calculate a set of moves that improves balance, subject to the requirement that no member both receives and gives away a bucket in the same round. This way, no member carries the memory overhead of a new bucket on top of an existing bucket it is ultimately removing.

2) Conduct all calculated bucket moves in parallel. Parameters to throttle this process should be added so that it does not take too many cluster resources and impact performance, such as a maximum number of buckets that each member may send or receive concurrently.

3) If the cluster is not yet balanced, perform additional iterations of calculating and conducting bucket moves until balance is achieved or a possible maximum number of iterations is reached.

Note: in both the existing and the proposed system, regions are rebalanced one at a time.

Please let us know if you have feedback on this approach or additional ideas that should be considered.
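
To make the three steps above easier to discuss, here is a rough, language-neutral sketch in Python. It is not Geode code and does not use any Geode API. The names plan_round, rebalance, and max_transfers are hypothetical, and the sketch assumes a single region, one copy of each bucket (no redundancy), equally sized buckets, and a greedy pairing of the most loaded member with the least loaded one. The real move calculation would likely weigh more than bucket counts.

```python
from typing import Dict, List, Set, Tuple

Move = Tuple[int, str, str]  # (bucket id, source member, target member)


def plan_round(buckets: Dict[str, Set[int]], max_transfers: int) -> List[Move]:
    """Step 1: plan moves so that each member only sends or only receives."""
    planned = {m: len(b) for m, b in buckets.items()}   # load after planned moves
    movable = {m: sorted(b) for m, b in buckets.items()}
    role: Dict[str, str] = {}                            # "sender" or "receiver"
    transfers = {m: 0 for m in buckets}                  # throttle counter
    moves: List[Move] = []
    while True:
        # A member already receiving may not send, and vice versa.
        sources = [m for m in buckets
                   if role.get(m) != "receiver" and transfers[m] < max_transfers]
        targets = [m for m in buckets
                   if role.get(m) != "sender" and transfers[m] < max_transfers]
        if not sources or not targets:
            break
        src = max(sources, key=lambda m: (planned[m], m))
        dst = min(targets, key=lambda m: (planned[m], m))
        if planned[src] - planned[dst] < 2:
            break  # no remaining eligible pair would improve balance
        bucket = movable[src].pop(0)
        moves.append((bucket, src, dst))
        role[src], role[dst] = "sender", "receiver"
        planned[src] -= 1
        planned[dst] += 1
        transfers[src] += 1
        transfers[dst] += 1
    return moves


def rebalance(buckets: Dict[str, Set[int]], max_transfers: int,
              max_rounds: int) -> int:
    """Steps 2 and 3: apply each round, repeat until balanced or capped."""
    rounds = 0
    for _ in range(max_rounds):
        moves = plan_round(buckets, max_transfers)
        if not moves:
            break  # bucket counts differ by at most one: balanced
        # A real system would run these moves in parallel; the sketch
        # applies them one after another since they are independent.
        for bucket, src, dst in moves:
            buckets[src].remove(bucket)
            buckets[dst].add(bucket)
        rounds += 1
    return rounds


cluster = {"a": set(range(6)), "b": set(), "c": set()}
print(rebalance(cluster, max_transfers=2, max_rounds=10), cluster)
```

In this toy run, member "a" starts with all six buckets and may send at most two per round, so balance takes two rounds. A higher throttle value would finish in fewer rounds at the cost of more concurrent transfers per member.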