I've noticed that partitions/replicas assigned to disconnected instances
are not automatically redistributed to live instances. What's the correct
way to trigger that redistribution?

For example, given this setup with Helix 0.6.5:
- 1 resource
- 2 replicas
- LeaderStandby state model
- FULL_AUTO rebalance mode
- 3 nodes (N1 is LEADER, N2 is STANDBY, N3 is idle)

Then disconnect N1:
- N2 becomes LEADER
- Nothing happens to N3

Naively, I would have expected N3 to transition from OFFLINE to STANDBY,
but that doesn't happen.

I can force redistribution from
GenericHelixController#onLiveInstanceChange by:
- dropping non-live instances from the cluster
- calling rebalance
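
Roughly, the two steps above boil down to the following. This is only a
minimal sketch: the StaleInstances class and the findStale helper are
hypothetical names used for illustration, the instance lists are hard-coded,
and the Helix calls are left as comments because their exact signatures
should be checked against 0.6.5.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Helix.
public class StaleInstances {

    /** Returns the configured instances that are not live, in configured order. */
    public static List<String> findStale(Collection<String> configured,
                                         Collection<String> live) {
        Set<String> liveSet = new HashSet<>(live);
        List<String> stale = new ArrayList<>();
        for (String name : configured) {
            if (!liveSet.contains(name)) {
                stale.add(name);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Assumption: in the real callback these would come from the cluster's
        // instance configs and from the live-instance list handed to
        // onLiveInstanceChange. They are hard-coded here.
        List<String> configured = Arrays.asList("N1", "N2", "N3");
        List<String> live = Arrays.asList("N2", "N3");

        for (String name : findStale(configured, live)) {
            // Step 1 (assumed Helix call, left as a comment):
            //   admin.dropInstance(clusterName,
            //       admin.getInstanceConfig(clusterName, name));
            System.out.println("would drop " + name);
        }
        // Step 2 (assumed Helix call, left as a comment):
        //   admin.rebalance(clusterName, resourceName, 2);
    }
}
```

With the setup above, only N1 is reported as stale, so only N1 would be
dropped before rebalancing.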

Dropping instances seems pretty unsafe, though. Is there a better way?
