Re: Massive state-update bottleneck at scale

Scott Blum Wed, 23 Nov 2016 15:59:38 -0800

On Wed, Nov 23, 2016 at 5:45 PM, Mark Miller <[email protected]> wrote:

> One thing is, when you reconnect after connecting to ZK, it should now
> efficiently set every core as down in a single command, not each core.

Yeah, I backported downnode, but it still takes a long time for the
Overseer to execute, and there can be a bunch of these in the queue for
the same node.

On Wed, Nov 23, 2016 at 5:53 PM, Mark Miller <[email protected]> wrote:

> In many cases other nodes need to see a progression of state changes. You
> really have to clear the deck and try to start from 0.

This is exactly the kind of detail I'm looking for.  Can you elaborate?

Unless we can come up with a better idea, my first experiment will be to
try to eliminate the "DOWN" replica state in all practical cases, relying
only on careful management of live_nodes presence.  For example, the
startup sequence (or reconnect sequence) would skip marking replicas down
and just ensure they're ACTIVE or else put them into RECOVERING, join shard
leader elections, and finally join live_nodes when that's done.

What land mines am I likely to run into or existing assumptions am I likely
to violate if I do that?
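
To make the proposed ordering concrete, here is a rough sketch in Python,
purely for illustration. It is a toy model, not Solr code: `Cluster`,
`start_node`, `is_usable`, and the `is_in_sync` callback are all hypothetical
names, and the rule that a replica only counts as usable when its node is in
live_nodes is my reading of "relying only on careful management of live_nodes
presence".

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Only the two states the proposed sequence ever publishes; no DOWN.
    ACTIVE = "active"
    RECOVERING = "recovering"


@dataclass
class Cluster:
    """Toy stand-in for shared cluster state; not Solr's real data model."""
    live_nodes: set = field(default_factory=set)
    replica_state: dict = field(default_factory=dict)  # replica name -> State
    election: list = field(default_factory=list)       # replicas in an election
    log: list = field(default_factory=list)            # ordered record of steps


def start_node(cluster, node, replicas, is_in_sync):
    """Hypothetical startup/reconnect order: skip DOWN, join live_nodes last."""
    for replica in replicas:
        # Step 1: ensure the replica is ACTIVE, or else put it into RECOVERING.
        state = State.ACTIVE if is_in_sync(replica) else State.RECOVERING
        cluster.replica_state[replica] = state
        cluster.log.append(("state", replica, state))
    for replica in replicas:
        # Step 2: join the shard leader election.
        cluster.election.append(replica)
        cluster.log.append(("election", replica))
    # Step 3: only now advertise the node in live_nodes.
    cluster.live_nodes.add(node)
    cluster.log.append(("live", node))


def is_usable(cluster, node, replica):
    """Assumed rule: usable means ACTIVE state and the node is in live_nodes."""
    return (node in cluster.live_nodes
            and cluster.replica_state.get(replica) is State.ACTIVE)
```

Under this model, a node that drops out of live_nodes makes all of its
replicas unusable without any per-replica state update, which is the effect
the DOWN state would otherwise provide.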
