Hey guys,

I'm troubleshooting performance issues on our cluster when scaling under
production load.

When we add a new drillbit to the cluster, performance degrades severely
as soon as it joins (queries that usually take 1s take 60s, for example).
After a few minutes it recovers just fine and all is normal again.

What I assume is happening is that the new drillbit is still initializing
or "warming up" but has already made itself available to take work.
This means that queries end up waiting for this drillbit to finish
initializing before they can return.

I haven't confirmed this in the profiles yet (we have a fair bit of load,
so I haven't isolated the individual long-running queries), but I'll
keep investigating.
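
For what it's worth, here is a rough sketch of how I might pick the slow
queries out of a dump of profile summaries. It assumes each summary is a
dict with `startTime` and `endTime` in epoch milliseconds plus `queryId`
and `foreman` fields; those field names and the `slow_queries` helper are
my own assumptions for illustration, so they may need adjusting to match
whatever the profile listing actually returns.

```python
def slow_queries(profiles, threshold_ms=10_000):
    """Return (duration_ms, profile) pairs at or above threshold, slowest first.

    `profiles` is an iterable of dicts. The `startTime`/`endTime` keys
    (epoch milliseconds) are assumed field names, not a confirmed schema.
    """
    slow = []
    for profile in profiles:
        start = profile.get("startTime")
        end = profile.get("endTime")
        # Skip entries that are still running or have unusable timestamps.
        if start is None or end is None or end < start:
            continue
        duration = end - start
        if duration >= threshold_ms:
            slow.append((duration, profile))
    # Slowest first, so the worst offenders are at the top.
    slow.sort(key=lambda pair: pair[0], reverse=True)
    return slow


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    sample = [
        {"queryId": "a", "startTime": 0, "endTime": 1_000, "foreman": "bit1"},
        {"queryId": "b", "startTime": 0, "endTime": 60_000, "foreman": "bit2"},
    ]
    for duration, profile in slow_queries(sample):
        print(profile["queryId"], profile["foreman"], duration)
```

Grouping the slow ones by start time relative to when the new drillbit
joined should show whether the slowdown lines up with it coming online.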

In the meantime, does that theory sound plausible?  And if so, what
initialization/warm-up is the drillbit doing?  Furthermore, could we
delay it joining the cluster for active work until it is completely ready
to take on work?

We're considering running some sort of autoscaling to handle varying load,
so this would be really crucial for us!

Any thoughts or pointing me in the right direction would be great.
