Hello,

I have a cluster with 1 control node (4 CPU, 4 GB RAM, 8 GB disk) and 1
worker node (24 vCPU, 48 GB RAM, 500 GB disk).

Whenever I stop this cluster, it takes around 30 minutes to shut down, and
about another 30 minutes to start back up. In comparison, my other clusters,
which have multiple worker nodes (4 CPU, 8 GB RAM, 50 GB disk), start and
stop in under 15 minutes.

I can see that the instances themselves shut down, but the cluster status
remains in “Stopping” for a long time. Similarly, when starting the
cluster, the instances come up quickly, but the cluster status stays in
“Starting” even after the instances are already running.

I tried deleting and recreating this cluster with the same specifications,
but the behavior remains the same. I also tried placing both the control
node and worker node on the same host machine, but there was no improvement.

Could someone please suggest how I can further investigate and identify the
root cause of this delay?
-- 
With Regards,
Nixon Varghese
