Hello! I'm currently working on a project that uses Apache Helix 0.8.4 (with a pending upgrade to Helix 1.0.1) to distribute partitions across a number of hosts (currently 32 partitions across 16 hosts). Once a partition is allocated to a host, several expensive initialization steps run, and the system then performs computations for that partition on a scheduled interval. We want to minimize initializations wherever possible.
If a host goes down (due to either maintenance or failure), the partitions get reshuffled. We currently use the CrushEdRebalanceStrategy in the hope of minimizing partition movement. However, we noticed that, unlike the earlier AutoRebalancer scheme, the CrushEdRebalanceStrategy does not limit the number of partitions per node. In our case this can cause severe out-of-memory issues, which then cascade as node after node is assigned more partitions than it can handle. On rare occasions we have seen our entire cluster fail as a result, after which our production engineers must manually, and carefully, bring the system back online. This is undesirable.
Does Helix have a rebalancing strategy that minimizes partition movement yet also permits enforcing a maximum number of partitions per node?
Thanks,
- Phong X. Nguyen
