This is a good question. We had the same problem in Pinot and solved it
with a shutdownInProgress flag in the InstanceConfig znode. The spectator
checks this flag and stops routing queries to that node. We avoided the
disable-instance approach.

The solution is as follows:
- The participant sets shutdownInProgress=true in its InstanceConfig from its
shutdown hook.
- The broker's routing table gets updated because the broker listens to
InstanceConfig changes.
- The RoutingTableProvider treats this node as disabled if it sees this
flag (you need to extend RoutingTableProvider).
- When the participant restarts, it sets shutdownInProgress=false as part of
registering itself in Helix.
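
To make the flow concrete, here is a rough, Helix-free sketch of the flag
logic only. It is not the Pinot code: the in-memory map stands in for the
InstanceConfig znodes, and the class and method names (ShutdownFlagSketch,
register, markShuttingDown, isRoutable, routableInstances) are made up for
illustration. In a real setup the reads and writes would go through Helix,
and the filtering would live in your extended RoutingTableProvider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Helix-free model of the shutdownInProgress flow. The map below stands in
// for the per-instance InstanceConfig znodes; all names here are hypothetical.
class ShutdownFlagSketch {
    static final String SHUTDOWN_IN_PROGRESS = "shutdownInProgress";

    // instance name -> simple fields of its (simulated) InstanceConfig
    private final Map<String, Map<String, String>> instanceConfigs =
        new ConcurrentHashMap<>();

    // Participant side: called while registering with the cluster on startup,
    // so a restarted node clears any flag left over from its last shutdown.
    void register(String instance) {
        setFlag(instance, false);
    }

    // Participant side: called from the shutdown hook before the process exits.
    void markShuttingDown(String instance) {
        setFlag(instance, true);
    }

    private void setFlag(String instance, boolean value) {
        instanceConfigs
            .computeIfAbsent(instance, k -> new ConcurrentHashMap<>())
            .put(SHUTDOWN_IN_PROGRESS, Boolean.toString(value));
    }

    // Spectator side: a node is routable only if it is registered and its
    // flag is absent or false.
    boolean isRoutable(String instance) {
        Map<String, String> config = instanceConfigs.get(instance);
        return config != null
            && !Boolean.parseBoolean(config.get(SHUTDOWN_IN_PROGRESS));
    }

    // Spectator side: drop flagged nodes from the candidates that the
    // external view still reports as online.
    List<String> routableInstances(List<String> onlineInstances) {
        List<String> result = new ArrayList<>();
        for (String instance : onlineInstances) {
            if (isRoutable(instance)) {
                result.add(instance);
            }
        }
        return result;
    }
}
```

In a real participant, markShuttingDown would be invoked from a JVM shutdown
hook (Runtime.getRuntime().addShutdownHook). The point of the design is that
the flag reaches the spectators before the node starts processing
Online->Offline transitions, so brokers stop routing to it first.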

This is a valid feature and could potentially be added to Helix.

thanks,
Kishore G


On Wed, Apr 18, 2018 at 10:44 AM, Bo Liu <[email protected]> wrote:

> Hi folks,
>
> We are running a service managed by Helix. When we rolling restart the
> service, we first disable instances through Helix before restarting the
> service processes in the hope that the read errors are minimized.
>
> However, the instances being restarted may get Online->Offline messages
> before clients get the latest version of the external view. I am wondering
> if there is any way to delay the Online->Offline messages generated by
> instance "disable" command?
>
> --
> Best regards,
> Bo
>
>
